Modification of Human Variable Domains 



The present application claims priority to EP 01 11 6756.6, filed on July 19th, 2001, which 
is hereby incorporated by reference in its entirety. 



Background of the invention 

Because of their high degree of specificity and broad target range, antibodies have found 
numerous applications in a variety of settings in basic research, clinical and industrial use, 
where they serve as tools to selectively recognize virtually any kind of substrate. However, 
despite their versatility there are intrinsic limitations in the use of antibody molecules for 
some important applications. For example, therapeutic or in vivo diagnostic antibody 
fragments require a long serum half-life in human patients to accumulate at the desired target, 
and they must, therefore, be resistant to precipitation and degradation by proteases (Willuda et 
al., 1999). Industrial applications often demand antibodies that can function in organic 
solvents, surfactants or at high temperatures, all of which pose severe challenges to the 
stability of these molecules (Dooley et al., 1998; Harris et al., 1994). There is also a size 
consideration, especially in clinical applications. Enhanced tumor penetration favors smaller 
molecules, thus making the large size of whole antibodies a potential liability in some 
treatment regimens. Furthermore, the high demand for, and the increasing number of, 
applications of antibodies require more efficient methods for their high-level production. 

Single-chain Fv (scFv) fragments are one antibody format designed to circumvent some of 
these limitations (Bird et al., 1988; Huston et al., 1988). The size of these molecules is 
reduced to the antigen-binding part of an antibody, and they contain the variable domains of 
the heavy and light chain connected via a flexible linker. Most scFv fragments can be easily 
obtained from recombinant expression in E. coli in sufficient amounts (Glockshuber et al., 
1992; Plückthun et al., 1996). As production yields of these fragments are influenced by their 
stability, as well as solubility and folding efficiency, considerable efforts have been made to 
identify positions in scFv fragments critical for influencing their expression behavior 
(Knappik & Plückthun, 1995; Forsberg et al., 1997; Kipriyanov et al., 1997; Nieba et al., 
1997). 

The factors influencing the stability of antibody molecules have been studied mostly with 
scFv fragments (Worn & Plückthun, 2001). The overall stability of scFv fragments depends 
on the intrinsic structural stability of VL and VH as well as on the extrinsic stabilization 
provided by their interaction (Worn & Plückthun, 1999). For some scFvs, the stabilities of 
isolated VH and VL domains, as well as of the whole scFv fragment, have been measured and 
compared recently (Jager et al., 2001; Jager & Plückthun, 1999a; Worn & Plückthun, 1999). 
The VH domain of the anti-HER2 scFv hu4D5-8, which was generated by loop grafting on a 
human VH3 consensus framework (Carter et al., 1992; Rodrigues et al., 1992), shows a free 
energy of unfolding of 14.4 kJ mol⁻¹ M⁻¹ (Jager et al., 2001). This low thermodynamic 
stability is surprising at first glance, but there are several differences in framework residues of 
the VH3 consensus sequence introduced after the loop grafting to increase affinity to HER2 
(Carter et al., 1992). The VH domain IcaH-01 of a catalytic antibody (Ohage et al., 1999) was 
engineered for stability by converting it to the consensus sequence (Steipe et al., 1994). 
Because of the frequent usage of VH3 domains, this overall consensus is heavily biased 
towards the VH3 consensus. Seven positions were identified and separately exchanged (Wirtz 
& Steipe, 1999). 

ScFv fragments, as well as complete human antibodies against a broad variety of tailored 
antigens, can now be obtained from several antibody libraries (Griffiths et al., 1994; Vaughan 
et al., 1996; Knappik et al., 2000). The libraries are enriched by panning for antibody 
fragments that bind the desired target molecule, but the selection procedure is biased by 
additional factors such as expression behavior, toxicity of the expressed antibody construct to 
the bacterial host, protease sensitivity, folding efficiency, and stability. There are two 
conceivable solutions to make a diverse library of stable frameworks. The first is to use a 
single stable framework (Holt et al., 2000; Pini et al., 1998; Soderlind et al., 2000). These 
libraries use the germline gene DP47 (Tomlinson et al., 1992) as the master framework for 
the VH domain, since this gene is well expressed in bacterial systems (Griffiths et al., 1994) 
and most frequently expressed in vivo in human individuals (de Wildt et al., 1999). The 
Griffiths library is built from a germline VH bank using in vitro generated CDR3 and FR4 
sequences (Griffiths et al., 1994). The diversity has been reached by introducing various point 
mutations in the CDRs (Holt et al., 2000; Pini et al., 1998) or by sampling CDRs from in 
vivo-processed gene sequences (Soderlind et al., 2000). 

The second possibility to achieve a structurally diverse library of stable frameworks is to 
optimize the human consensus antibody frameworks further. Frameworks that differ in their 
framework 1 conformation (Honegger & Plückthun, 2001a; Jung 
et al., 2001; Saul & Poljak, 1993) may access a different range of CDR2 conformations (Saul 
& Poljak, 1993), while different framework 4 sequences affect CDR3 conformation. The 
Human Combinatorial Antibody Library (HuCAL, Knappik et al., 2000) consists of 
combinations of seven VH and seven VL synthetic consensus frameworks connected via a 
linker region forming 49 master genes (Knappik et al., 2000). 
The basis for this library is a set of consensus sequences of the framework regions of the 
major VH and VL subfamilies (VH1, VH2, VH3, VH4, VH5, and VH6, Vκ1, Vκ2, Vκ3, Vκ4, 
Vλ1, Vλ2 and Vλ3). These subfamilies were identified from known germline sequences 
(VBASE, Cook & Tomlinson, 1995) with the VH1 subfamily further divided into VH1a and 
VH1b because of different CDR-H2 conformations. For each of the subfamilies, a consensus 
sequence for the framework regions was calculated from a database of all known rearranged 
antibody sequences belonging to that subfamily. 

These 14 consensus sequences ideally represent the structural repertoire of human variable 
domain frameworks. 

These consensus sequences, containing germline CDR1 and CDR2 sequences of the 
corresponding germline variable domain and identical CDR3s, were used for expression 
studies (Knappik et al., 2000). Thus, it could be shown that the individual VH and VL 
domains are well expressed and stable in E. coli. However, these studies, and studies on their 
individual performance in recombinant libraries (Hanes et al., 2000), showed that there are 
nevertheless striking differences between the individual variable domains when compared to 
each other. 

Enhanced overall expression and stability of antibodies or fragments thereof is highly 
desirable for most applications of antibody libraries. 

Thus, the technical problem of the present invention is to improve the relative stability, 
overall expression and solubility of antibodies or fragments thereof. The solution to the 
above-mentioned technical problem is achieved by providing the embodiments characterized 
in the claims and disclosed hereinafter. 



The technical approach of the present invention, i.e., modifying one or more framework 
residues in a human variable heavy or light chain antibody domain of a particular subclass 
with reference to a VH or a VL domain, respectively, of another subclass, is neither provided 
nor suggested by the prior art. 

SUMMARY OF THE INVENTION: 

The present invention provides antibodies having, inter alia, a modified framework region, 
using methods described and contemplated herein. Methods for mutating nucleic acid 
sequences are well known to the practitioner skilled in the art, including but not limited to 
cassette mutagenesis, site-directed mutagenesis, and mutagenesis by PCR (see for example 
Sambrook et al., 1989; Ausubel et al., 1999). 

In one aspect, the present invention provides isolated polypeptides (and isolated nucleic acid 
sequences encoding the same) that contain a VH domain selected from the group consisting of 
(i) a VH domain belonging to the VH1a subclass, wherein the VH domain contains an amino 
acid residue F at position 29 and/or L at position 89; (ii) a VH domain belonging to the VH1b 
subclass, wherein the VH domain contains the amino acid residue L at position 89; (iii) a VH 
domain belonging to the VH2 subclass, wherein the VH domain contains at least one amino 
acid residue selected from the group consisting of G at position 16, V at position 44, A at 
position 47, G at position 76, F at position 78, Y at position 90, R at position 97, and E at 
position 99, wherein if R is at position 97, then E is at position 99; (iv) a VH domain belonging 
to the VH4 subclass, wherein the VH domain contains at least one amino acid residue selected from 



WO 03/008451 PCT/EP02/08094 

the group consisting of G at position 16, A at position 47, F at position 78, Y at position 90, R 
at position 97, and E at position 99, wherein if R is at position 97, then E is at position 99; (v) 
a VH domain belonging to the VH5 subclass, wherein the VH domain contains at least one 
amino acid residue selected from the group consisting of L at position 89, R at position 97, 
and E at position 99, wherein if R is at position 97, then E is at position 99; and (vi) a VH 
domain belonging to the VH6 subclass, wherein the VH domain contains at least one amino 
acid residue selected from the group consisting of V at position 5, G at position 16, I at 
position 58, F at position 78, Y at position 90, R at position 97, and E at position 99, 
wherein if R is at position 97, then E is at position 99. 

The present invention also provides isolated polypeptides (and isolated nucleic acid sequences 
encoding the same) that contain a VL domain selected from the group consisting of (i) a VL 
domain belonging to the VLκ2 subclass, wherein the VL domain contains the amino acid 
residue R at position 18, and wherein if R is at position 18, then T is at position 92; and (ii) a 
VL domain belonging to the VLλ1 subclass, wherein the VL domain contains the amino acid 
residue K at position 47. 
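
The residues recited in the two preceding paragraphs, together with the coupled-position provisos (R at 97 with E at 99; R at 18 with T at 92), can be written down as a small lookup table. The following Python sketch is illustrative only: the names RECITED, coupling_rules and matches_summary are hypothetical, the ASCII keys "VLk2" and "VLl1" stand in for the VLκ2 and VLλ1 subclasses, and a domain is assumed to be supplied as a mapping from structurally numbered position to one-letter residue code.

```python
# Residues recited in the summary, keyed by subclass and position.
# Keys "VLk2" / "VLl1" are ASCII stand-ins for VL-kappa-2 / VL-lambda-1.
RECITED = {
    "VH1a": {29: "F", 89: "L"},
    "VH1b": {89: "L"},
    "VH2": {16: "G", 44: "V", 47: "A", 76: "G", 78: "F", 90: "Y", 97: "R", 99: "E"},
    "VH4": {16: "G", 47: "A", 78: "F", 90: "Y", 97: "R", 99: "E"},
    "VH5": {89: "L", 97: "R", 99: "E"},
    "VH6": {5: "V", 16: "G", 58: "I", 78: "F", 90: "Y", 97: "R", 99: "E"},
    "VLk2": {18: "R"},
    "VLl1": {47: "K"},
}


def coupling_rules(subclass):
    """Return ((pos, aa), (pos, aa)) pairs; the first residue implies the second."""
    rules = []
    if RECITED[subclass].get(97) == "R":
        rules.append(((97, "R"), (99, "E")))
    if subclass == "VLk2":
        rules.append(((18, "R"), (92, "T")))
    return rules


def matches_summary(subclass, residues):
    """Check a domain (dict: position -> residue) against the recited residues.

    True when at least one recited residue is present and every coupling
    proviso that is triggered is also satisfied.
    """
    recited = RECITED[subclass]
    has_one = any(residues.get(pos) == aa for pos, aa in recited.items())
    coupled_ok = all(
        residues.get(p2) == a2
        for (p1, a1), (p2, a2) in coupling_rules(subclass)
        if residues.get(p1) == a1
    )
    return has_one and coupled_ok
```

For instance, a VH5 domain carrying R at 97 but not E at 99 is rejected by this check, whereas one carrying both is accepted.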

The nucleic acid sequences encoding the polypeptides of the invention can be used, e.g., for 
the construction of libraries of antibodies or fragments thereof. Libraries of antibodies or 
fragments thereof have been described in various publications (see, e.g., Vaughan et al., 1996; 
Knappik et al., 2000; US 6,300,064, which are incorporated by reference in their entirety), 
and are well-known to one of ordinary skill in the art. 

In the context of the present invention, the term "VH domain" refers to the variable part of the 
heavy chain of an immunoglobulin molecule. The term "VH... subclass" includes the subclass 
defined by the corresponding "VH..." consensus sequence taken from the HuCAL (VH1a, 
VH1b, VH2, VH3, VH4, VH5, and VH6; Knappik et al., 2000) generated as described above. In 
this context, the term "subclass" refers to a group of variable domains sharing a high degree 
of identity and similarity represented by a consensus sequence of the major VH subfamilies, 
wherein the term "subfamily" is used as a synonym for "subclass." In the context of the 
present invention, the term "consensus sequence" refers to the HuCAL consensus genes. The 
determination whether a given VH domain is "belonging to a VH subclass" is made by 
alignment of the VH domain with all known human VH germline segments (VBASE, Cook & 
Tomlinson, 1995) and determination of the highest degree of homology using a homology 
search matrix such as BLOSUM (Henikoff & Henikoff, 1992). Methods for determining 
homologies and grouping of sequences according to homologies are well known to one of 
ordinary skill in the art. The grouping of the individual germline sequences into subclasses is 
done according to Knappik et al. (2000). 
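
The subclass determination described above (score the domain against each germline segment, take the best-scoring segment, and read off its subclass) can be sketched as follows. This is a minimal illustration under stated assumptions: the sequences are taken to be pre-aligned to equal length, a plain match count stands in for a BLOSUM-based score, and the entries of TOY_GERMLINES are arbitrary placeholder strings, not real germline sequences. All function and variable names are hypothetical.

```python
def score(a, b, match=1, mismatch=0):
    """Score two pre-aligned sequences; a simple stand-in for a BLOSUM score."""
    if len(a) != len(b):
        raise ValueError("sequences must be pre-aligned to equal length")
    return sum(match if x == y else mismatch for x, y in zip(a, b))


def assign_subclass(query, germlines):
    """Return (subclass, segment name) of the best-scoring germline segment.

    germlines: dict mapping segment name -> (subclass, aligned sequence).
    """
    best = max(germlines, key=lambda name: score(query, germlines[name][1]))
    return germlines[best][0], best


# Placeholder data only: arbitrary strings, NOT real germline segments.
TOY_GERMLINES = {
    "toy_segment_1": ("VH1a", "ACDEFGHI"),
    "toy_segment_2": ("VH3", "ACDKFGHL"),
}

print(assign_subclass("SCDKFGHL", TOY_GERMLINES))
```

In practice the scoring function would be replaced by a substitution-matrix score over a proper alignment, and the germline table by the grouping of Knappik et al. (2000).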

In the context of the present invention, the term "VL domain" refers to the variable part of the 
light chain of an immunoglobulin molecule. The term "VL... subclass" refers to the subclass 
defined by the corresponding "VL..." consensus sequence taken from the HuCAL (Vκ1, Vκ2, 
Vκ3 and Vκ4 as well as Vλ1, Vλ2 and Vλ3; Knappik et al., 2000) generated as described 
above. 

In this library, a consensus sequence for each of the major VL-subfamilies was generated from 
known antibody sequences (VBASE, Cook & Tomlinson, 1995). In the context of the present 
invention, the numbering of the amino acid residues is according to the structurally adjusted 
scheme of Honegger & Plückthun (2001b). 

In the context of the present invention, the term "antibody" is used as a synonym for 
"immunoglobulin". Antibodies or fragments thereof according to the present invention may 
be Fv (Skerra & Plückthun, 1988), scFv (Bird et al., 1988; Huston et al., 1988), 
disulfide-linked Fv (Glockshuber et al., 1992; Brinkmann et al., 1993), Fab, (Fab')2 fragments, single 
VH domains or other fragments well-known to the practitioner skilled in the art, which 
comprise at least one variable domain of an immunoglobulin or immunoglobulin fragment 
and have the ability to bind to a target. 

DETAILED DESCRIPTION 

The invention provides novel immunoglobulin sequences and methods for making the same. 

The present inventors surprisingly discovered a scheme for optimizing certain framework 
regions of an immunoglobulin of any variable heavy or light chain subclass, using the 
sequences of another subclass (i.e., subfamily) as a reference point. The present invention 
also relates to a method for the further modification of such optimized human variable 
domains comprising the steps of: (i) identifying for said domain the corresponding amino acid 
consensus sequence selected from the group of VH consensus sequences consisting of VH1a, 
VH1b, VH2, VH4, VH5, and VH6, and (ii) substituting one or more codons corresponding to 
amino acid residues of said consensus sequence into a corresponding position(s) in said 
nucleic acid sequence of said domain. 

The following procedure describes a generally applicable method for improving the properties 
of any given human immunoglobulin heavy chain variable domain while keeping binding 
activity. (This method can be readily modified, using the guidance provided herein, to 
improve the properties of any given human immunoglobulin light chain variable domain). 
The first task is to compare each residue of the given domain to different subsets of 
immunoglobulin sequences. As the binding activity preferably is retained, residues of CDR1 
(25-40), CDR2 (57-77), CDR3 (109-137) and the outer loop (84-87) are generally not 
considered (numbering scheme according to Honegger and Plückthun (2001b)). After 
determination of the framework 1 class, the subtype-determining (6, 7, 9, 10) and 
subtype-corresponding (19, 74, 78, 93) residues are compared to the consensus of sequences falling 
into the same class (Honegger and Plückthun, 2001a). The other residues are then compared 
to the consensus sequences of the VH domains with favorable properties (families 1, 3 and 5) 
(see Example 1, Knappik et al., 2000). Next, the differences in residues are analyzed using 
structure models (see Example 2). Mutations that increase the expression yield of soluble 
protein and/or thermodynamic stability, as seen in this study, include: (i) mutations which 
replace a non-glycine residue in a loop with a positive phi-angle by glycine, (ii) mutations of 
residues in a β-strand with low β-sheet propensity to a residue with high β-sheet propensity, 
(iii) mutations of solvent-exposed hydrophobic residues to hydrophilic ones, and (iv) 
replacement of residues with unsatisfied H-bonds. 
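
The comparison step of this procedure, i.e., listing the framework positions at which a given domain differs from a reference consensus while skipping CDR1, CDR2, CDR3 and the outer loop, can be sketched as below. The position ranges are those stated above; everything else is an assumption made for illustration: the function names are hypothetical, each domain is represented as a mapping from structurally numbered position to residue, and the residues in the usage example are invented. The subsequent structural analysis of each difference is not modeled.

```python
# Position ranges generally not considered (inclusive), as stated in the text:
# CDR1, CDR2, CDR3 and the outer loop.
EXCLUDED = [(25, 40), (57, 77), (109, 137), (84, 87)]


def is_excluded(pos):
    """True if the position lies in a CDR or in the outer loop."""
    return any(lo <= pos <= hi for lo, hi in EXCLUDED)


def framework_differences(domain, reference):
    """Return {position: (domain residue, reference residue)} for framework
    positions present in both mappings at which the residues differ."""
    diffs = {}
    for pos in sorted(domain.keys() & reference.keys()):
        if is_excluded(pos):
            continue
        if domain[pos] != reference[pos]:
            diffs[pos] = (domain[pos], reference[pos])
    return diffs


# Hypothetical residues, for illustration only.
domain = {16: "S", 30: "T", 89: "M", 90: "Y"}
reference = {16: "G", 30: "S", 89: "L", 90: "Y"}
print(framework_differences(domain, reference))
```

Position 30 is skipped in this example because it falls within CDR1; each remaining difference would then be a candidate for inspection in a structure model.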

In a preferred embodiment, the present invention relates to a method for the modification of 
certain human VH domains belonging to a VH subclass which is not VH3, comprising the steps 
of: (a) identifying certain amino acid residues of said VH domain being different compared to 
the corresponding amino acid residues of the HuCAL VH3 domain, and (b) replacing at least one 
of the differing amino acid residues by the corresponding amino acid residues of the HuCAL 
VH3 domain, provided that the replacing amino acid residue is not the consensus amino acid 
residue of said subclass. 

This basic method is, in principle, also applicable to VL domains. For example, Vκ domains 
can be compared to the consensus sequence of Vκ3, as this domain displays the highest 
thermodynamic stability and expression yield of Vκ domains. The physical principles for 
the rational design of Vλ domains are the same as with the VH domains described above. 

In a preferred embodiment, the present invention relates to an isolated polypeptide comprising 
a VH domain belonging to the VH1a subclass, wherein said VH domain comprises an amino 
acid residue F at position 29 and L at position 89. 

In yet a further embodiment, the invention relates to an isolated polypeptide comprising a VH 
domain belonging to the VH1b subclass, wherein said VH domain comprises the amino acid 
residue L at position 89. 

In a further preferred embodiment, the invention relates to an isolated polypeptide comprising 
a VH domain belonging to the VH2 subclass, wherein said VH domain comprises at least one 
amino acid residue selected from the group consisting of G at position 16, V at position 44, A 
at position 47, G at position 76, F at position 78, Y at position 90, R at position 97, and E at 
position 99, wherein if R is at position 97, then E is at position 99. 

In yet a further preferred embodiment, the invention relates to an isolated polypeptide 
comprising a VH domain belonging to the VH4 subclass, wherein said VH domain comprises at 
least one amino acid residue selected from the group consisting of G at position 16, A at 
position 47, F at position 78, Y at position 90, R at position 97, and E at position 99, wherein 
if R is at position 97, then E is at position 99. 

In yet a further preferred embodiment, the invention relates to an isolated polypeptide 
comprising a VH domain belonging to the VH5 subclass, wherein said VH domain comprises at 
least one amino acid residue selected from the group consisting of L at position 89, R at 
position 97, and E at position 99, wherein if R is at position 97, then E is at position 99. 

In a further preferred embodiment, the present invention relates to an isolated polypeptide 
comprising a VH domain belonging to the VH6 subclass, wherein said VH domain comprises at 
least one amino acid residue selected from the group consisting of V at position 5, G at 
position 16, I at position 58, F at position 78, Y at position 90, R at position 97, and E at 
position 99, wherein if R is at position 97, then E is at position 99. 

In yet a further preferred embodiment, the invention relates to an antibody or functional 
fragment thereof comprising any VH domain according to the present invention. Further 
preferred is a library of antibodies or functional fragments thereof comprising one or more 
antibodies or functional fragments thereof according to the present invention. 

A library according to the present invention could be generated, starting from the HuCAL 
library (Knappik et al., 2000), by optimizing one or more of the VH and/or VL consensus 
sequences in accordance with the teaching of the present invention, and by introducing 
diversity into at least one CDR region in said optimized sequence, e.g., by using 
oligonucleotide cassettes synthesized using trinucleotide-directed mutagenesis as described in 
Knappik et al., 2000. 

In yet a further preferred embodiment, the present invention relates to an isolated polypeptide 
comprising a VL domain belonging to the VLκ2 subclass, wherein said VL domain comprises 
the amino acid residue R at position 18, and wherein if R is at position 18, then T is at position 
92. 

In a further preferred embodiment, the present invention relates to an isolated polypeptide 
comprising a VL domain belonging to the VLλ1 subclass, wherein said VL domain comprises 
the amino acid residue K at position 47. 

In yet a further preferred embodiment, the present invention relates to an antibody or a 
functional fragment thereof comprising a VL domain according to the present invention. 

In a most preferred embodiment, the present invention relates to libraries of antibodies or 
functional fragments thereof comprising one or more antibodies or functional fragments 
thereof according to the present invention. 

In a further preferred embodiment, the present invention relates to a method for the 
modification of a human VH domain belonging to the VH1a subclass by generating a modified 
VH domain comprising at least one amino acid residue exchange taken from the list of: (a) 29 
to F and (b) 89 to L. 

In yet a further embodiment, the invention provides for a method for the modification of a 
human VH domain belonging to the VH1b subclass by generating a modified VH domain 
comprising the amino acid residue exchange: 89 to L. 

In a further embodiment, the invention relates to a method for the modification of a human VH 
domain belonging to the VH2 subclass by generating a modified VH domain comprising at 
least one amino acid residue exchange taken from the list of: (a) 16 to G; (b) 44 to V; (c) 47 to 
A; (d) 76 to G; (e) 78 to F; (f) 97 to R, provided that the amino acid residue 99 is, or is 
exchanged to, E; and (g) 99 to E. Further preferred is a method for the modification of a VH 
domain belonging to the VH2 subclass, by generating a modified VH domain comprising the 
amino acid residue exchange 90 to Y. 

In a further preferred embodiment, the invention relates to a method for the modification of a 
human VH domain belonging to the VH4 subclass by generating a modified VH domain 
comprising at least one amino acid residue exchange taken from the list of: (a) 16 to G; (b) 44 
to V; (c) 47 to A; (d) 76 to G; (e) 78 to F; (f) 97 to R, provided that the amino acid residue 99 
is, or is exchanged to, E; and (g) 99 to E. Further preferred is a method for the modification of 
a human VH domain belonging to the VH4 subclass, by generating a modified VH domain 
comprising the amino acid residue exchange 90 to Y. 

In a further preferred embodiment, the invention provides for a method for the modification 
of a human VH domain belonging to the VH5 subclass by generating a modified VH domain 
comprising at least one amino acid residue exchange taken from the list of: (a) 77 to R; (b) 89 
to L; (c) 97 to R, provided that the amino acid residue 99 is, or is exchanged to, E; and (d) 99 
to E. 

In yet a further embodiment, the invention provides for a method for the modification of a 
human VH domain belonging to the VH6 subclass by generating a modified VH domain 
comprising at least one amino acid residue exchange taken from the list of: (a) 5 to V; (b) 16 
to G; (c) 44 to V; (d) 58 to I; (e) 72 to D; (f) 76 to G; (g) 78 to F and (h) 97 to R, provided that 
the amino acid residue 99 is, or is exchanged to, E. Further preferred is a method for the 
modification of a VH domain belonging to the VH6 subclass, by generating a modified VH 
domain comprising the amino acid residue exchange 90 to Y. 



In another embodiment, the present invention relates to a method for the modification of a VH 
domain, wherein 2 or more amino acid residues are exchanged. 

In a further embodiment, the present invention provides for a method for the modification of a 
VH domain comprising the steps of (i) providing a nucleic acid molecule encoding said VH 
domain; and (ii) mutating said nucleic acid molecule, resulting in a modified nucleic acid molecule 
encoding said modified VH domain. 
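
At the sequence level, step (ii) amounts to replacing the codon of the chosen residue by a codon for the desired amino acid. The Python sketch below illustrates only this bookkeeping and rests on several assumptions: substitute_codon and CODON are hypothetical names, the codon chosen for each amino acid is one arbitrary valid codon rather than an optimized choice, and residue_index is a 0-based index into the linear coding sequence. The mapping from the structurally adjusted position numbers used in this document to such a linear index is not shown.

```python
# One valid codon per amino acid used in this document's substitutions.
# The particular codon picked for each residue is an arbitrary choice.
CODON = {
    "F": "TTC", "L": "CTG", "G": "GGC", "V": "GTG", "A": "GCG", "Y": "TAC",
    "R": "CGT", "E": "GAA", "I": "ATC", "K": "AAA", "T": "ACC", "D": "GAC",
}


def substitute_codon(cds, residue_index, new_aa):
    """Return a copy of the coding sequence with one codon replaced.

    cds: in-frame coding sequence (length divisible by 3).
    residue_index: 0-based residue index in the linear sequence.
    new_aa: one-letter code of the residue to introduce.
    """
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    start = residue_index * 3
    if not 0 <= start < len(cds):
        raise IndexError("residue index outside the coding sequence")
    return cds[:start] + CODON[new_aa] + cds[start + 3:]


# Replace the second codon (GCT, alanine) with a leucine codon.
print(substitute_codon("ATGGCTAAA", 1, "L"))  # ATGCTGAAA
```

In the laboratory the modified sequence would be produced by one of the mutagenesis methods mentioned in the summary, such as site-directed mutagenesis or PCR mutagenesis.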

In a preferred embodiment, the present invention relates to a method for obtaining a 
polypeptide according to the present invention, comprising the step of substituting in a VH1a 
subclass domain at least one amino acid residue selected from the group consisting of F at 
position 29 and L at position 89. 

In yet a further preferred embodiment, the present invention relates to a method for obtaining 
a polypeptide according to the present invention, comprising the step of substituting in a VH1b 
subclass domain the amino acid residue L at position 89. 

In a further preferred embodiment, the present invention relates to a method for obtaining a 
polypeptide according to the present invention, comprising the step of substituting in a VH2 
subclass domain at least one amino acid residue selected from the group consisting of G at 
position 16, V at position 44, A at position 47, G at position 76, F at position 78, R at position 
97, and E at position 99, wherein if R is at position 97, then E is at position 99. Further 
preferred is a method for obtaining the polypeptide according to the present invention, 
comprising the step of substituting in a VH2 subclass domain the amino acid residue Y at 
position 90. 

In a further preferred embodiment, the present invention relates to a method for obtaining the 
polypeptide according to the present invention, comprising the step of substituting in a VH4 
subclass domain at least one amino acid residue selected from the group consisting of G at 
position 16, V at position 44, A at position 47, G at position 76, F at position 78, R at position 
97, and E at position 99, wherein if R is at position 97, then E is at position 99. Further 
preferred is a method for obtaining the polypeptide according to the present invention, 
comprising the step of substituting in a VH4 subclass domain the amino acid residue Y at 
position 90. 

In yet a further preferred embodiment, the present invention relates to a method for obtaining 
the polypeptide according to the present invention, comprising the step of substituting in a 
VH5 subclass domain at least one amino acid residue selected from the group consisting of R 
at position 77, L at position 89, R at position 97, and E at position 99, wherein if R is at 
position 97, then E is at position 99. 

In a further preferred embodiment, the present invention relates to a method for obtaining a 
polypeptide according to the present invention, comprising the step of substituting in a VH6 
subclass domain at least one amino acid residue selected from the group consisting of V at 
position 5, G at position 16, V at position 44, I at position 58, D at position 72, G at position 
76, F at position 78, R at position 97, and E at position 99, wherein if R is at position 97, 
then E is at position 99. Further preferred is a method for obtaining a polypeptide according to 
the present invention, comprising the step of substituting in a VH6 subclass domain the amino 
acid residue Y at position 90. 

In a further preferred embodiment, the present invention relates to a method for obtaining a 
polypeptide according to the present invention, wherein 2 or more amino acid residues are 
substituted. 

In yet a further preferred embodiment, the present invention relates to a method for obtaining 
the polypeptide according to the present invention, comprising the step of substituting in 
a VLκ2 subclass domain at least one amino acid residue selected from the group consisting of 
S at position 12, Q at position 45, and R at position 18, wherein if R is at position 18, then T 
is at position 92. 

In yet a further preferred embodiment, the present invention relates to a method for obtaining 
the polypeptide according to the present invention, comprising the step of substituting in a 
VLλ1 subclass domain at least one amino acid residue selected from the group consisting of K 
at position 47. 

In a further preferred embodiment, the present invention relates to a method for obtaining a 
polypeptide according to the present invention, comprising the step of substituting in a VLλ1, 
VLλ2 and VLλ3 domain the amino acid residue P at position 8. Further preferred is a method 
for obtaining a polypeptide according to the present invention, wherein P is at position 8, and 
further comprising the substitutions S at positions 7 and 9. 

In a further preferred embodiment, the present invention relates to a method according to the 
present invention, wherein 2 or more amino acid residues are substituted. 

In a further preferred embodiment, the present invention relates to a method for obtaining a 
polypeptide according to the present invention further comprising the step of expressing a 
modified nucleic acid molecule. 

In a further preferred embodiment, the present invention relates to an isolated nucleic acid 
molecule encoding an inventive VH domain, an antibody or a functional fragment thereof, as 
disclosed or contemplated herein. 

In a further preferred embodiment, the present invention relates to an isolated nucleic acid 
molecule encoding an inventive VL domain, an antibody or a functional fragment thereof, as 
disclosed or contemplated herein. 

In a further preferred embodiment, the present invention relates to a method for producing a 
VL domain, antibody or a functional fragment thereof, as described or contemplated herein, 
comprising the step of expressing an isolated nucleic acid molecule of the present invention. 

The invention also provides for conservative amino acid variants of the molecules of the 
invention. Variants according to the invention also may be made that conserve the overall 
molecular structure of the encoded proteins. Given the properties of the individual amino 
acids comprising the disclosed protein products, some rational substitutions will be 
recognized by the skilled worker. Amino acid substitutions, i.e., "conservative substitutions," 
may be made, for instance, on the basis of similarity in polarity, charge, solubility, 
hydrophobicity, hydrophilicity, and/or the amphipathic nature of the residues involved. 
For example: (a) nonpolar (hydrophobic) amino acids include alanine, leucine, isoleucine, 
valine, proline, phenylalanine, tryptophan, and methionine; (b) polar neutral amino acids 
include glycine, serine, threonine, cysteine, tyrosine, asparagine, and glutamine; (c) positively 
charged (basic) amino acids include arginine, lysine, and histidine; and (d) negatively charged 
(acidic) amino acids include aspartic acid and glutamic acid. Substitutions typically may be 
made within groups (a)-(d). In addition, glycine and proline may be substituted for one 
another based on their ability to disrupt α-helices. Similarly, certain amino acids, such as 
alanine, cysteine, leucine, methionine, glutamic acid, glutamine, histidine and lysine are more 
commonly found in α-helices, while valine, isoleucine, phenylalanine, tyrosine, tryptophan and 
threonine are more commonly found in β-pleated sheets. Glycine, serine, aspartic acid, 
asparagine, and proline are commonly found in turns. Some preferred substitutions may be 
made among the following groups: (i) S and T; (ii) P and G; and (iii) A, V, L and I. Given 
the known genetic code, and recombinant and synthetic DNA techniques, the skilled scientist 
readily can construct DNAs encoding the conservative amino acid variants. 

As used herein, "sequence identity" between two polypeptide sequences indicates the 
percentage of amino acids that are identical between the sequences. "Sequence similarity" 
indicates the percentage of amino acids that either are identical or that represent conservative 
amino acid substitutions. 
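
The two definitions above can be illustrated with a minimal Python sketch. It assumes two pre-aligned, ungapped sequences of equal length in one-letter code and treats a substitution as conservative only when both residues fall within the same group (a)-(d) listed earlier; the function names are hypothetical, and alignment itself is outside the scope of the sketch.

```python
# Groups (a)-(d) from the text, written in one-letter amino acid code.
CONSERVATIVE_GROUPS = [
    set("ALIVPFWM"),  # (a) nonpolar (hydrophobic)
    set("GSTCYNQ"),   # (b) polar neutral
    set("RKH"),       # (c) positively charged (basic)
    set("DE"),        # (d) negatively charged (acidic)
]


def is_conservative(a, b):
    """True if residues a and b belong to the same group (a)-(d)."""
    return any(a in group and b in group for group in CONSERVATIVE_GROUPS)


def identity_and_similarity(seq1, seq2):
    """Return (percent identity, percent similarity) for aligned sequences.

    Assumption: seq1 and seq2 are already aligned, ungapped and of equal length.
    """
    if not seq1 or len(seq1) != len(seq2):
        raise ValueError("sequences must be non-empty and of equal length")
    pairs = list(zip(seq1.upper(), seq2.upper()))
    identical = sum(x == y for x, y in pairs)
    similar = sum(x == y or is_conservative(x, y) for x, y in pairs)
    n = len(pairs)
    return 100.0 * identical / n, 100.0 * similar / n


if __name__ == "__main__":
    # Illustrative peptides only, not sequences from the invention.
    print(identity_and_similarity("ASTK", "ATTR"))
```

In this illustration two of four positions are identical and the other two are exchanges within one group, so similarity exceeds identity.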

The invention also provides nucleic acids that hybridize under high stringency conditions to
the V H and/or V L domains, antibodies or functional fragments thereof, according to the
present invention. As used herein, highly stringent conditions are those that are tolerant of
up to about 5-20% sequence divergence, preferably about 5-10%. Without limitation,
examples of highly stringent (about 10°C below the calculated Tm of the hybrid) conditions use a
wash solution of 0.1 X SSC (standard saline citrate) and 0.5% SDS at the appropriate Ti
below the calculated Tm of the hybrid. The ultimate stringency of the conditions is primarily
due to the washing conditions, particularly if the hybridization conditions used are those
that allow less stable hybrids to form along with stable hybrids. The wash conditions at
higher stringency then remove the less stable hybrids. A common hybridization condition
that can be used with the highly stringent to moderately stringent wash conditions described
above is hybridization in a solution of 6 X SSC (or 6 X SSPE), 5 X Denhardt's reagent, 0.5%
SDS, 100 ng/ml denatured, fragmented salmon sperm DNA at an appropriate incubation
temperature Ti. See generally Sambrook et al., Molecular Cloning: A Laboratory Manual, 2d
edition, Cold Spring Harbor Press (1989), for suitable high stringency conditions.

Stringency conditions are a function of the temperature used in the hybridization experiment
and washes, the molarity of the monovalent cations in the hybridization solution and in the
wash solution(s) and the percentage of formamide in the hybridization solution. In general,
sensitivity by hybridization with a probe is affected by the amount and specific activity of the
probe, the amount of the target nucleic acid, the detectability of the label, the rate of
hybridization, and the duration of the hybridization. The hybridization rate is maximized at a
Ti (incubation temperature) of 20-25°C below Tm for DNA:DNA hybrids and 10-15°C below
Tm for DNA:RNA hybrids. It is also maximized by an ionic strength of about 1.5 M Na+.
The rate is directly proportional to duplex length and inversely proportional to the degree of
mismatching.

Specificity in hybridization, however, is a function of the difference in stability between the 
desired hybrid and "background" hybrids. Hybrid stability is a function of duplex length, 
base composition, ionic strength, mismatching, and destabilizing agents (if any).

The Tm of a perfect hybrid may be estimated for DNA:DNA hybrids using the equation of
Meinkoth et al. (1984), as

Tm = 81.5°C + 16.6 (log M) + 0.41 (%GC) - 0.61 (% form) - 500/L

and for DNA:RNA hybrids, as

Tm = 79.8°C + 18.5 (log M) + 0.58 (%GC) - 11.8 (%GC)^2 - 0.56 (% form) - 820/L

where

M, molarity of monovalent cations, 0.01-0.4 M NaCl,

%GC, percentage of G and C nucleotides in DNA, 30%-75%,

% form, percentage formamide in hybridization solution, and

L, length of the hybrid in base pairs.

Tm is reduced by 0.5-1.5°C (an average of 1°C can be used for ease of calculation) for each
1% mismatching.
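
As a worked illustration of the DNA:DNA equation above, the following Python sketch evaluates Tm and applies the average correction of 1°C per 1% mismatching. The function name, parameter names and example values are assumptions made for illustration; the DNA:RNA equation is deliberately not implemented here.

```python
import math


def tm_dna_dna(molarity, pct_gc, pct_formamide, length_bp, pct_mismatch=0.0):
    """Estimate Tm (in degrees C) of a DNA:DNA hybrid.

    molarity      -- M, molarity of monovalent cations (text: 0.01-0.4 M)
    pct_gc        -- %GC, percentage of G and C nucleotides (text: 30-75)
    pct_formamide -- % form, percentage formamide in the hybridization solution
    length_bp     -- L, length of the hybrid in base pairs
    pct_mismatch  -- percent mismatching; reduces Tm by an average of 1 degree C per 1%
    """
    if molarity <= 0 or length_bp <= 0:
        raise ValueError("molarity and length_bp must be positive")
    tm_perfect = (81.5
                  + 16.6 * math.log10(molarity)
                  + 0.41 * pct_gc
                  - 0.61 * pct_formamide
                  - 500.0 / length_bp)
    return tm_perfect - 1.0 * pct_mismatch


if __name__ == "__main__":
    # Hypothetical probe: 0.1 M salt, 50% GC, no formamide, 500 bp.
    print(round(tm_dna_dna(0.1, 50.0, 0.0, 500), 1))
    # Same hybrid with 5% mismatching.
    print(round(tm_dna_dna(0.1, 50.0, 0.0, 500, pct_mismatch=5.0), 1))
```

An estimate of this kind can then be used to choose the incubation temperature Ti for the final wash relative to the calculated Tm.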

The Tm may also be determined experimentally. As increasing length of the hybrid (L) in the 
above equations increases the Tm and enhances stability, the full-length rat gene sequence can 
be used as the probe. 

Filter hybridization is typically carried out at 68°C, and at high ionic strength (e.g., 5-6 X
SSC), which is non-stringent, and followed by one or more washes of increasing stringency,
the last one being of the ultimately desired high stringency. The equations for Tm can be
used to estimate the appropriate Ti for the final wash, or the Tm of the perfect duplex can be
determined experimentally and Ti then adjusted accordingly.

In a further preferred embodiment, the present invention relates to a method for producing a
V H domain, antibody or a functional fragment thereof, as described or contemplated herein, 
comprising the step of expressing an isolated nucleic acid molecule of the present invention. 

In particular, such method comprises the steps of: (i) providing a nucleic acid molecule
encoding a V H domain; (ii) mutating said nucleic acid molecule resulting in a modified
nucleic acid molecule encoding a modified V H domain comprising at least one amino acid
residue exchange. Methods for mutating nucleic acid sequences are well known to the
practitioner skilled in the art, including but not limited to cassette mutagenesis, site-directed
mutagenesis and mutagenesis by PCR (see for example Sambrook et al., 1989; Ausubel et al.,
1999).

Further preferred is a vector comprising an isolated nucleic acid molecule according to the 
present invention. 

In yet a further preferred embodiment, the invention relates to a host cell harboring an isolated 
nucleic acid molecule according to the present invention or a vector according to the present 
invention. 

In a further preferred embodiment, the V H domains according to the present invention can be 
used for all applications of antibodies including but not limited to the construction, 
generation, expression and screening of antibody libraries.

In a further preferred embodiment, the V L domains according to the present invention can be
used for all applications of antibodies including but not limited to the construction,
generation, expression and screening of antibody libraries.

In yet a further preferred embodiment, the present invention relates to an antibody or a
functional fragment thereof (and methods of making the same), that contains any combination
of a V H and V L domain described herein. For example, an antibody may comprise (i) a V H
domain belonging to the V H 1a subclass, wherein said V H domain comprises an amino acid
residue F at position 29 and/or L at position 89; and (ii) a V L domain belonging to the Vκ2
subclass, wherein said V L domain comprises one or more of the following substitutions: S at
position 12, Q at position 45, or R at position 18, provided that if R is at position 18, then T is
at position 92.

In still a further preferred embodiment, the present invention relates to a library of antibodies 
or functional fragments thereof comprising one or more antibodies or functional fragments 
thereof, according to the present invention. 

In a further preferred embodiment, the present invention relates to an isolated nucleic acid 
molecule encoding an antibody or functional fragment thereof according to the present 
invention. 



WO 03/008451 PCT/EP02/08094

Figure captions



Figure 1. Determination of apparent molecular mass of isolated V H and V L domains. Gel
filtration runs were performed in 50 mM sodium-phosphate (pH 7.0) and 500 mM NaCl of (a)
isolated human consensus V H domains (5 µM) on a Superdex-75 column with V H 3 (solid line)
and V H 1a (dotted line) and V H 1a in the presence of 0.9 M GdnHCl (long dashed line); (b)
isolated Vκ domains (50 µM) on a Superose-12 column with Vκ1 (solid), Vκ2 (long dashed),
Vκ3 (dotted) and Vκ4 (short dashed line); and (c) isolated Vλ domains (5 µM) on a TSK
column with Vλ1 (solid), Vλ2 (long dashed) and Vλ3 (dotted line). Arrows indicate elution
volumes of molecular mass standards: carbonic anhydrase (29 kDa), and cytochrome c
(12.4 kDa). (d) Equilibrium sedimentation of Vκ3 at 19,000 rpm with a detection wavelength
of 280 nm. The solid line was obtained from fitting of the data to a single species, and a
molecular weight of 13616 Da was calculated. The residuals of the fit are scattered randomly,
indicating that the assumption of the monomeric state is valid.

Figure 2. Overlay of GdnHCl denaturation curves of V H domains. (a) V H 1a (filled circles),
V H 1b (open squares), V H 3 (filled squares) and V H 5 (open circles). (b) V H 2 (filled circles),
V H 4 (open squares) and V H 6 (filled squares). All unfolding transitions (a and b) were
measured by following the change in emission maximum as a function of denaturant
concentration at an excitation wavelength of 280 nm.

Figure 3. Overlay of GdnHCl denaturation curves of V L domains. (a) Vκ domains with
Vκ1 (filled circles), Vκ2 (filled squares), Vκ3 (open squares) and Vκ4 (open circles) and (b) Vλ
domains with Vλ1 (filled squares), Vλ2 (filled circles), Vλ3 (open squares). All unfolding
transitions (a and b) were measured by following the change of fluorescence intensity as a
function of denaturant concentration at an excitation wavelength of 280 nm.

Figure 4. Model structure of a scFv fragment consisting of human consensus Vκ3 (PDB
entry: 1DH5) and V H 3 domain (PDB entry: 1DHU). (a) Secondary structure with Vκ3 on the
left and V H 3 on the right side. (b) Marked for charged residues (grey: Arg, Lys and His; black:
Asp and Glu). At the base of each domain is an accumulation of charged residues, the charge
clusters of V L and V H domains. (c) Hydrophobic core residues: Above the conserved Trp43
(light grey) is the upper core (dark grey) and below the lower core (black); see text for details.
(d) Positions possibly influencing folding efficiency are shown in light grey; see text for
details. All images were generated using the program MOLMOL (Koradi et al., 1996).

Figure 5. Detailed view of the charge cluster of the human consensus (a) V H 3 and (b) Vκ3
family with hydrogen bonds. Images were generated using the program MOLMOL (Koradi et
al., 1996).

Figure 6. Detailed view of the upper core residues. Superposition of (a) V H 4, (b) V H 1a and
(c) V H 5, each in light grey, with V H 3 in black and (d) Vλ1 in light grey with Vκ3 in black; see
text for details. The conserved Trp43 is shown. Residues 4, 80 and 82 are not shown, as they
do not contribute to the packing differences discussed in the text. Images were generated
using the program MOLMOL (Koradi et al., 1996).

Figure 7. Detailed view of the lower core residues that correspond to framework 1
classification. Superposition of (a) V H 1a (light grey) and V H 3 (black), (b) V H 4 (light grey)
and V H 3 (black) and (c) Vλ1 (light grey) and Vκ3 (black); see text for details. The conserved
Trp43 is shown. Images were generated using the program MOLMOL (Koradi et al., 1996).

Figure 8. Analytical gel filtration of scFv fragments (5 µM) on a Superdex-75 column in
50 mM sodium-phosphate (pH 7.0) and 500 mM NaCl: (a) H3κ3 (solid line), H4κ3
(long-dashed line), H1aκ3 (short-dashed line) and H1aκ3 in the presence of 1 M GdnHCl
(short-dashed line), (b) H3κ3 (solid line), H3κ1 (long-dashed line), H3λ1 (short-dashed line) and
H3λ1 in the presence of 1 M GdnHCl (short-dashed line). Arrows indicate elution volumes of
molecular mass standards: bovine serum albumin (66 kDa), carbonic anhydrase (29 kDa), and
cytochrome c (14 kDa).

Figure 9. Overlay of GdnHCl denaturation curves to illustrate different cases of
interface stabilization. In each panel the scFv fragment (filled squares) and accompanying
isolated V H (open squares) and V L (open circles) domains are shown. All unfolding transitions
in (a) with H5κ3, (b) with H1aκ3, (c) with H3κ1 and (d) with H3κ2 were measured by
following the change in emission maximum (in case of scFv fragments and V H domains) or
fluorescence intensity (in case of V L domains) as a function of denaturant concentration at an
excitation wavelength of 280 nm.

Figure 10. Overlay of GdnHCl denaturation curves to illustrate the role of different
L-CDR3 in interface stabilization in Vλ domains. In (a) with H3λ1 with the λ-like L-CDR3
and (b) with H3λ1 with the κ-like L-CDR3 the scFv fragments (filled squares) and
constituent isolated V H 3 (open squares) and Vλ1 (open circles) domains are shown. As the
isolated Vλ domains with the κ-like CDR3 show non-reversible behavior, in (b) the
renaturation curve of Vλ1 is also shown (filled circles). All unfolding transitions were
measured by following the change in emission maximum (in case of scFv fragments and V H
domains) or fluorescence intensity (in case of V L domains) as a function of denaturant
concentration at an excitation wavelength of 280 nm.

Figure 11. Analytical gel filtration of 2C2-wt, 2C2-all, 6B3-wt and 6B3-all in 50 mM
sodium-phosphate (pH 7.0) and 500 mM NaCl on a Superdex-75 column at a concentration of
5 µM. 6B3-wt (long-dashed line) and 6B3-all (dotted line) show a similar elution volume.
Arrows indicate elution volumes of molecular mass standards: bovine serum albumin (66
kDa), carbonic anhydrase (29 kDa), and cytochrome c (12.4 kDa). The mutations carried by
2C2-all and 6B3-all are listed in Table 7 and Figure 12.

Figure 12. Overlay of GdnHCl denaturation curves of (a) 2C2-wt, 2C2-all, 6B3-wt and
6B3-all, (b) single mutations (abbreviations used: a = Q5V, b = S16G, c = T58I, d = V72D, e
= S76G, f = S90Y and all = abcdef) and (c) multiple mutations to the consensus of V H
domains with favorable properties and (d) mutations (abbreviations used: g = P10A and gh =
P10A+V74F) to the framework 1 subtype III exemplified with the scFv 2C2. In (b), (c) and
(d) the bold solid line and the bold dotted line represent the fits (Jäger et al., 2001) of the
experimental data shown in (a) of 2C2-wt and 2C2-all, respectively. All unfolding transitions
were measured by following the change in emission maximum as a function of denaturant
concentration at an excitation wavelength of 280 nm.

Figure 13. Aligned HuCAL V H sequences. The amino acids are shaded according to residue
type: aromatic residues (Tyr, Phe, Trp), hydrophobic residues (Leu, Ile, Val, Met, Cys, Pro,
Ala), uncharged hydrophilic residues (Ser, Thr, Gln, Asn, Gly), acidic residues (Asp, Glu),
basic residues (Arg, Lys, His). Residues that show correlated sequence differences between
the groups of V H domains with favorable properties (V H 1a, V H 1b, V H 3, V H 5) and V H domains
with less favorable properties (V H 2, V H 4, V H 6) are indicated by white boxes. Numbering scheme
is according to Kabat et al. (1991) and Honegger & Plückthun (2001b).

Figure 14. Overview of the single mutations to the consensus of those V H domains with
favorable properties. In the middle of the figure a model scFv fragment consisting of V H 6
(black ribbon, PDB entry: 1DHZ) and Vκ3 domain (gray ribbon, PDB entry: 1DH5) is
shown with the single mutations indicated by arrows that point to enlargements of the single
mutations. All images were generated using the program MOLMOL (Koradi et al., 1996).
Numbering scheme is according to Honegger & Plückthun (2001b).

Figure 15. Overview of framework 1 subtype III determining residues (6, 7 and 10) and
correlated residues (19, 74, 78, 93) (a) in the wild type V H 6 domain (PDB entry: 1DHZ) and
(b) in the model of the double mutated form with the changes P10A and V74F. (c) Ribbon
representation of the V H 6 domain with black frame indicating the enlarged area depicted in
(a) and (b). All images were generated using the program MOLMOL (Koradi et al., 1996).
Numbering scheme according to Honegger & Plückthun (2001b).

Figure 16. Comparison of the binding activities of (a) 2C2-wt and 2C2-all and (b) 6B3-wt
and 6B3-all. BIAcore experiments are shown, with resonance units plotted against time after
injection of different scFv concentrations over an antigen-coated chip. Solid lines indicate
wild-type scFv fragments and dotted lines indicate scFv fragments carrying all six mutations
toward the consensus of favorable V H domains. In (a) 2C2-wt and 2C2-all at concentrations of
1.25, 0.63, 0.31 and 0.16 µM and in (b) 6B3-wt and 6B3-all at concentrations of 1.25, 0.63,
0.31, 0.16 and 0.08 µM are plotted.

Figure 17. Competition BIAcore analysis of 6B3-wt and 6B3-all. (a) 6B3-wt (16 nM) and
(b) 6B3-all (10 nM) were incubated with different concentrations of myoglobin for 1 hour and
injected over a myoglobin-coated sensor chip. From the linear sensorgrams, the slopes
(resonance units vs. time in sec) were plotted against the corresponding total soluble antigen
concentration. The slopes correlate to uncomplexed scFv in the injected solutions. Kd was
calculated from a fit according to Hanes et al. (1998). Each point is the average of three
independent measurements.

The example illustrates the invention.

In the following examples, all molecular biology experiments are performed according to
standard protocols (Ausubel et al., 1999).

Example 1 

Construction of Expression Vectors 

The starting points for all expression vectors were the scFv master genes of the HuCAL library in
the orientation V H -(Gly4Ser)4-V L in the expression vector pBS13 (Knappik et al., 2000),
which all carried H-CDR3 and L-CDR3 of the antibody hu4D5-8 (Carter et al., 1992).
The seven isolated human consensus V H domains were PCR amplified from the master genes
and the CDR3 region between the BssHII and StyI restriction sites was then exchanged to
code for a CDR-H3 found by metabolic selection (J. Burmester et al., unpublished results):
YNHEADMLIRNWLYSDV. The final expression plasmids were derivatives of the vector
pAK400 (Krebber et al., 1997), in which the expression cassette of the seven different V H
domains had been introduced between the XbaI and HindIII restriction sites, and where the
skp cassette (Bothmann & Plückthun, 1998) had been introduced at the NotI restriction site.
The expression cassette consists of a phoA signal sequence, the short FLAG-tag (DYKD), one
of the seven V H domains and a hexahistidine-tag.

The seven isolated human consensus V L domains were cut out from the master genes with the
restriction enzymes EcoRV and EcoRI and ligated into a pAK400 derivative with these
restriction sites. The L-CDR3 of the Vλ domains between the BbsI and MscI restriction sites
was exchanged to QSYDSSLSGW (107-138). This λ-like L-CDR3 is a consensus L-CDR3
from sequences found in the Kabat database (Kabat et al., 1991) for Vλ domains, in contrast to
the κ-like L-CDR3 of hu4D5-8 with the conserved cis-proline in position 136. The chosen
length of the consensus λ-like L-CDR3 is found in 20 % of the sequences, representing the
highest percentage. The tryptophan at position 109, which is the most frequent residue with
54 %, was exchanged to tyrosine, which is present in 20 % of the sequences, to avoid
interference with the native state fluorescence signal of the conserved unique tryptophan. The
final expression cassette consists of a pelB signal sequence, one of the seven V L domains and
a hexahistidine-tag.

The scFv fragments were cloned via the restriction sites XbaI and EcoRI into the expression
plasmid pMX7. The κ-like L-CDR3 was exchanged in the Vλ domains as reported above. The
final expression cassette consists of a phoA signal sequence, the short FLAG-tag (DYKD),
one of the seven V H domains, a (Gly4Ser)4 linker and one of the seven V L domains, the long
FLAG-tag (DYKDDDD) and a hexahistidine-tag.

Soluble periplasmic expression 

dYT medium (30 ml containing 30 µg/mL chloramphenicol, 1.0 % glucose) was inoculated
with a single bacterial colony and incubated overnight at 25 °C. One liter of dYT medium (30
µg/mL chloramphenicol, 50 mM K2HPO4) was inoculated with the preculture and incubated
at 25°C (5 L flask with baffles, 105 rpm). Expression was induced at an OD550 of 1.0 by
addition of IPTG to a final concentration of 0.5 mM. Incubation was continued for 18 hours,
by which time the cell density had reached an OD550 between 8.0 and 11.0. Cells were collected by
centrifugation (8000 g, 10 minutes at 4°C), suspended in 40 ml of 50 mM Tris-HCl (pH 7.5)
and 500 mM NaCl and disrupted by French Press lysis. The crude extract was centrifuged
(48,000 g, 60 minutes at 4°C), the supernatant passed through a 0.2 µm filter and directly
applied to IMAC chromatography.

Preparative two-column purification

The proteins were purified using the two-column coupled in-line procedure (Plückthun et al.,
1996). In this strategy, the eluate of an immobilized metal ion affinity chromatography
(IMAC) column, which exploits the C-terminal His-tag, was directly loaded onto an
ion-exchange column. Elution from the ion-exchange column was achieved with a 0-800 mM
NaCl gradient. The V H and Vκ domains were purified with an HS cation-exchange column in
10 mM MES (pH 6.0) and the Vλ domains and the scFv fragments with an HQ
anion-exchange column in 10 mM Tris-HCl (pH 8.0). Pooled fractions were dialyzed against 50
mM Na-phosphate, pH 7.0, 100 mM NaCl.

Insoluble periplasmic expression 

LB medium (30 ml, containing 30 µg/ml chloramphenicol, 1 % glucose) was inoculated with
a single colony and incubated overnight at 37 °C. One liter of SB medium (10 µg/ml
chloramphenicol, 0.1 % glucose, 0.4 M sucrose) was inoculated with 10 ml of the preculture
and incubated at 25°C. Expression was induced at an OD550 of 0.8 by addition of IPTG to a
final concentration of 0.05 mM. Incubation was continued for about 15 hours at 25 °C. After
centrifugation, cells were suspended in 100 mM Tris-HCl, pH 8.0, 2 mM MgCl2 and
disrupted by French Press lysis. Inclusion bodies were isolated following a standard protocol
(Buchner & Rudolph, 1991). The inclusion body pellet from 1 l bacterial culture was
solubilized at room temperature in 10 ml of solubilization buffer (0.2 M Tris-HCl, pH 8.0, 6
M guanidine hydrochloride (GdnHCl), 10 mM EDTA, 50 mM DTT). The resulting solution
was centrifuged and the supernatant dialyzed against solubilization buffer without DTT at
10°C. The sample was loaded on a nitrilotriacetic acid column (Qiagen), which had been
charged with Ni2+, and IMAC under denaturing conditions was performed. The eluate was
diluted (1:10) into refolding buffer (0.5 M Tris-HCl, pH 8.5, 0.4 M arginine, 5 mM EDTA,
20% glycerol, 0.5 mM ε-amino-caproic acid, 0.5 mM benzamidinium-HCl) at 16 °C at a final
protein concentration of 1 µM. The formation of disulfide bonds was catalyzed by the
presence of reduced and oxidized glutathione in the refolding buffer at molar concentrations
of [GSH] : [GSSG] of either 0.2 : 1 mM (oxidizing conditions) or 5 : 1 mM (reducing conditions). The
refolding mixture was incubated at 16 °C for 20 hours and dialyzed against 50 mM
Na-phosphate, pH 7.0, 100 mM NaCl.

Ni-NTA batch purification 

Twenty mL of the supernatant of the French press lysis of the scFv fragments was incubated
with 2 mL of a 50 % Ni-NTA slurry for 30 min at room temperature. The suspension was
applied on an empty column with a diameter of 1.5 cm and washed extensively with 50 mM
sodium-phosphate (pH 7.0) and 1 M NaCl. To remove unspecific binding proteins, the
column was washed with 30 mM imidazole. The scFv fragments were eluted by adding
250 mM imidazole. The purity of the samples was checked by SDS-PAGE analysis and the
concentration was determined by absorbance at 280 nm. Four scFv fragments were purified in
parallel with H3κ3 always as a control. The yield was normalized to the yield of H3κ3 and to
a 1 L expression culture with an OD550 of 10.
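
The normalization described above can be sketched in Python as follows. The text does not give an explicit formula, so the sketch assumes that yield scales linearly with culture volume and with the final OD550; the function names and all numerical values are hypothetical and serve only to illustrate the arithmetic.

```python
def normalized_yield(yield_mg, culture_volume_l, od550, reference_od=10.0):
    """Scale a purified yield to a 1 L culture at OD550 = reference_od.

    Assumption: yield is proportional to culture volume and to final OD550.
    """
    if culture_volume_l <= 0 or od550 <= 0:
        raise ValueError("culture volume and OD550 must be positive")
    return yield_mg / culture_volume_l * (reference_od / od550)


def relative_yield(sample, control):
    """Yield of a sample relative to the control purified in parallel.

    Each argument is a (yield_mg, culture_volume_l, od550) tuple.
    """
    return normalized_yield(*sample) / normalized_yield(*control)


if __name__ == "__main__":
    # Made-up numbers: a sample scFv and the control purified in the same run.
    sample = (4.0, 1.0, 8.0)     # 4 mg from 1 L grown to OD550 = 8
    control = (10.0, 1.0, 10.0)  # 10 mg from 1 L grown to OD550 = 10
    print(normalized_yield(*sample), relative_yield(sample, control))
```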

Determination of insoluble protein ratio 

An aliquot of a French press lysis extract of a 1 L scFv fragment expression experiment was
centrifuged at 4 °C for 30 minutes at 16000 g. The supernatant (soluble fraction) and the
precipitate (insoluble fraction), which was resuspended in 50 mM Tris-HCl (pH 7.5) and 500
mM NaCl, were analyzed by SDS-PAGE followed by Western Blot with the anti-His
antibody 3D5 as described (Lindner et al., 1997). Chemiluminescence was detected using a
ChemiImager™ 4400 (Alpha Innotech Corporation) and the density of the bands was
determined with the software ChemiImager™ 5500 (Alpha Innotech Corporation). As the
method involves many steps, the error is possibly high, and therefore we give the values as a
percentage of insoluble material, rounded to tens, with an estimated error of 10%.
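
As a small illustration of how such a value could be derived, the Python sketch below computes the insoluble percentage from two band densities and rounds it to the nearest ten. It assumes that equivalent amounts of the soluble and insoluble fractions were loaded and that band density is proportional to protein amount; the function name is hypothetical.

```python
def insoluble_percent(soluble_density, insoluble_density):
    """Percentage of insoluble material, rounded to the nearest ten.

    Assumption: both fractions were loaded in equivalent amounts and band
    density is proportional to the amount of protein.
    """
    total = soluble_density + insoluble_density
    if total <= 0:
        raise ValueError("total band density must be positive")
    percent = 100.0 * insoluble_density / total
    # Round half up to a multiple of ten (avoids banker's rounding in round()).
    return int(percent / 10.0 + 0.5) * 10


if __name__ == "__main__":
    # Arbitrary densitometry units, for illustration only.
    print(insoluble_percent(30.0, 70.0))
```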

Gel filtration chromatography 

Samples of purified proteins were analyzed on a gel filtration column equilibrated with
50 mM Na-phosphate, pH 7.0, 500 mM NaCl. The isolated V H domains and the scFv
fragments at a concentration of 5 µM were injected on a Superdex-75 column (Pharmacia)
and the isolated Vκ domains at a concentration of 50 and 5 µM on a Superose-12 column
(Pharmacia) in a volume of 50 µL and a flow-rate of 60 µL/min on a SMART-system
(Pharmacia). The Vλ domains were injected on a silica based TSK-Gel® G3000SWXL
column (TosoH) on a HPLC system (HP) in a volume of 50 µL at a concentration of 5 µM
and a flow rate of 0.5 mL/min. Lysozyme (14 kDa), carbonic anhydrase (29 kDa) and bovine
serum albumin (66 kDa) were used as molecular standards. Elution was followed by detection
of the absorbance at 280 nm in the case of the SMART-system and at 220 nm in the case of the
HPLC system.

Ultracentrifugation 

Sedimentation equilibria were determined with an XL-A analytical ultracentrifuge
(Beckman). The samples were dialyzed against 10 mM sodium-phosphate (pH 7.0) and 100
mM NaCl overnight and loaded into a standard 6 channel 12 mm pathlength cell at a sample
OD280 of 0.4. The fluorocarbon FC43 was added to each cell sector to provide a false bottom.
The samples were run for 24 h at 20 °C at 19000 rpm. Data were collected at 280 nm at a
radial spacing of 0.001 cm and a minimum of 10 scans were averaged for each sample. Data
were analyzed with software provided by the instrument manufacturer using models that
assumed either the presence of a single species or of a monomer-dimer equilibrium as
described previously (Liu et al., 1998). Solvent densities and sample partial volumes were
calculated using standard methods.



Expression and protein purification of V H domains 

The seven HuCAL consensus V H domains representing the major framework subclasses were
expressed with the same CDR-H3 to enable the comparison of their biophysical properties.
First the V H domains were investigated with the CDR3 from the antibody hu4D5-8
(WGGDGFYAMDY) (Carter et al., 1992), but the V H domains were insoluble when
expressed on their own, and only a small inclusion body pellet was obtained. This was not
surprising, as many if not most V H domains by themselves are insoluble upon periplasmic
expression (Jäger et al., 2001; Jäger & Plückthun, 1999b; Wirtz & Steipe, 1999), since they
contain an exposed large hydrophobic interface which is usually covered by V L. However,
recently three isolated V H domains from the HuCAL (with framework classes V H 1a, V H 1b,
and V H 3) have been selected in a metabolic selection experiment. These could be expressed in
the periplasm of E. coli and purified from the soluble fraction of the cell extracts. The main
feature of the selected V H domains is the length of the CDR3, as all three selected and soluble
V H fragments contain a longer CDR3. This long CDR3 may cover the hydrophobic interface
of V H, thereby preventing aggregation. After introducing the CDR3 from one of the selected
V H 3 domains (YNHEADMLIRNWLYSDV), V H 1a, V H 1b and V H 3 could be expressed in
soluble form in the periplasm of E. coli and purified from the soluble fraction of the cell
extracts with a yield of 2 mg/l.

In contrast, V H 2, V H 4, V H 5 and V H 6 were still insoluble in the E. coli periplasm. These
domains were purified from the insoluble fraction with IMAC under denaturing conditions,
and the eluted fractions were subjected to in vitro refolding. Approximately 1 mg soluble,
refolded V H 5 domain could be obtained from 1 l E. coli culture using an oxidizing glutathione
redox shuffle. V H 2, V H 4 and V H 6 could only be refolded using a redox shuffle with an excess
of reduced glutathione and yielded about 0.2 mg soluble, refolded protein from 1 l E. coli culture.
V H 1a, V H 1b, V H 3 and V H 5 remained in solution at 4 °C and no degradation was observed. In
contrast, V H 2, V H 4 and V H 6 have a high tendency to aggregate upon standing at 4°C.
Therefore, all subsequent experiments were performed with freshly purified proteins.

Analytical gel filtration 

Samples of purified V H domains were analyzed on a Superdex-75 column equilibrated with
50 mM Na-phosphate, pH 7.0, 100 mM NaCl, on a SMART-system (Pharmacia). The V H
domains were injected at a concentration of 2 µM in a volume of 50 µl, and the flow-rate was
50 µl/min. Lysozyme (14 kDa), carbonic anhydrase (29 kDa) and bovine serum albumin (66
kDa) were used as molecular standards.

To analyze the oligomeric state of the purified domains in solution, analytical gel filtration
experiments were performed. V H 1b, V H 3, and V H 5 elute at the expected size of a monomer
(Figure 1a with V H 3 as an example for monomeric V H domains). V H 1a elutes under native
conditions in three peaks that could not be assigned. We therefore investigated whether small
amounts of denaturant might break up the aggregates. Using an elution buffer containing 0.5
M GdnHCl, the unassigned peaks decreased and a peak at the size of a monomer appeared.
With 0.9 M GdnHCl V H 1a elutes in a single peak corresponding to a monomer (Figure 1b
with the elution profile of V H 1a at 0 and 0.9 M GdnHCl). V H 2, V H 4 and V H 6 did not elute
from the column under native conditions. Even addition of 1.7 M GdnHCl to the elution
buffer did not prevent these domains from sticking to the column. Elution could only be
achieved with 1 M NaOH.

Equilibrium denaturation experiments of VH fragments

Fluorescence spectra were recorded at 25 °C with a PTI Alpha Scan spectrofluorimeter
(Photon Technologies, Inc., Ontario, Canada). Slit widths of 2 and 5 nm were used for
excitation and emission, respectively. Protein/GdnHCl mixtures (2 ml) containing a final
protein concentration of 0.5 µM and denaturant concentrations ranging from 0 to 5 M
GdnHCl were prepared from freshly purified protein and a GdnHCl stock solution (7.2 M, in
50 mM NaPO4, pH 7.0, 100 mM NaCl). Each final concentration of GdnHCl was determined
from its refractive index. After overnight incubation at 10 °C, the fluorescence emission
spectra of the samples were recorded from 320 to 370 nm with an excitation wavelength of
280 nm. With increasing denaturant concentrations, the maxima of the recorded emission
spectra shifted from about 342 to 348 nm. For isolated VH domains and scFv fragments, the
fluorescence emission maximum was determined by fitting the emission spectrum to a
Gaussian function and plotted versus the GdnHCl concentration; for isolated VL domains,
the fluorescence intensity at 345 nm was plotted instead. Protein stabilities for the isolated
human consensus VH and VL domains were calculated as described (Jäger et al., 2001). To
compare VH, VL and scFv denaturation curves in one plot, relative emission maxima and
fluorescence intensities were scaled by setting the highest value to 1 and the lowest to 0.
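
The scaling step described above amounts to a min-max normalization of each curve. The
following is a minimal sketch in Python; the function name and the emission maxima in the
example are hypothetical illustration values, not measured data.

```python
def scale_zero_to_one(values):
    """Rescale a series so that its lowest value maps to 0 and its highest to 1."""
    lowest, highest = min(values), max(values)
    if highest == lowest:
        raise ValueError("cannot scale a constant series")
    return [(v - lowest) / (highest - lowest) for v in values]


# Hypothetical emission maxima (nm) at increasing GdnHCl concentration.
emission_maxima_nm = [342.0, 342.5, 345.0, 347.5, 348.0]
scaled = scale_zero_to_one(emission_maxima_nm)
```

Because every curve ends up on the same 0 to 1 axis, emission maxima (VH, scFv) and
intensities at 345 nm (VL) can be overlaid even though their raw units differ.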

The thermodynamic stability of the seven human consensus VH domains was examined by
GdnHCl equilibrium denaturation experiments. Unfolding of the VH domains was monitored
by the shift of the fluorescence emission maximum as a function of denaturant concentration.
Figure 2a shows an overlay of the equilibrium denaturation curves of VH1a, VH1b, VH3 and
VH5. In Figure 2b the overlay is normalized to show the fraction of unfolded protein. The
equilibrium denaturation of these domains is cooperative and reversible, which indicates
two-state behavior. The VH1a domain starts to unfold at 0.9 M GdnHCl, where VH1a is
monomeric in solution as indicated by gel filtration analysis. Therefore, the transition is
influenced only by the stability of the monomeric VH1a domain and is not affected by
multimerization equilibria. For the determination of the free energy of unfolding, the
pretransition region of VH1a, whose actual slope is influenced by the spectral changes caused
by dissociation, was assumed to have the same slope and intercept as that of the VH1b
domain. VH3 displays the highest change in free energy upon unfolding (ΔGN-U) with 52.7
kJ mol⁻¹ and an unfolding cooperativity (m) of 17.6 kJ mol⁻¹ M⁻¹. VH1b is of intermediate
stability with a ΔGN-U of 26.0 kJ mol⁻¹ and an m of 12.7 kJ mol⁻¹ M⁻¹. VH1a and VH5 are
less stable and have ΔGN-U values of 13.7 and 19.1 kJ mol⁻¹ and m values of 10.1 and 8.6
kJ mol⁻¹ M⁻¹, respectively (Table 1). The range of m values can be compared to that expected
for proteins of this size (14-15 kDa) and indicates that at least VH1a, VH1b, and VH3 have the
cooperativity expected for a two-state transition (Myers et al., 1995). The transition curves of
VH2, VH4 and VH6 in Figure 2c show poor cooperativity, which indicates that these domains
do not follow two-state behavior during GdnHCl equilibrium denaturation. As the monomeric
state of these VH domains could not be ascertained, it is likely that part of this complicated
transition involves the dissociation of multimers. The broad transitions of VH2 and VH4
occurred between 1.0 and 2.5 M GdnHCl with midpoints of 1.6 and 1.8 M GdnHCl,
respectively. VH6 shows a transition between 0.5 and 1.4 M GdnHCl with a midpoint of 0.8
M. This is the lowest midpoint of the examined domains, which indicates that VH6 is the least
stable human VH domain.
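
For the domains that unfold in a two-state manner, the reported free energies and
cooperativity values can be related to a transition midpoint. The sketch below assumes the
usual linear extrapolation model, in which the free energy of unfolding decreases linearly
with denaturant concentration and the midpoint is the concentration at which it reaches zero.
The model choice and the function name are assumptions for illustration; only the numerical
inputs are taken from the text above.

```python
# (free energy of unfolding in kJ/mol, cooperativity m in kJ/mol/M), as reported above.
reported = {
    "VH3": (52.7, 17.6),
    "VH1b": (26.0, 12.7),
    "VH1a": (13.7, 10.1),
    "VH5": (19.1, 8.6),
}


def transition_midpoint(dg_water, m_value):
    """Denaturant concentration (M) at which dG(D) = dg_water - m_value * [D] is zero."""
    return dg_water / m_value


midpoints = {name: transition_midpoint(dg, m) for name, (dg, m) in reported.items()}

for name, c_mid in sorted(midpoints.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: midpoint of about {c_mid:.2f} M GdnHCl")
```

Under this assumption a domain with a low m value can have a higher midpoint than its free
energy alone would suggest, which is one reason midpoints are only a semi-quantitative
measure of stability.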

Expression and protein purification of VL fragments

The four human consensus Vκ domains (Vκ1, Vκ2, Vκ3 and Vκ4) carrying the κ-like
L-CDR3 from the antibody hu4D5-8 (sequence: HYTTP; Carter et al., 1992) were expressed in
soluble form in the periplasm of E. coli. After purification with IMAC followed by a cation
exchange column, the Vκ domains could be obtained in high amounts, ranging from 17.1
mg/L bacterial culture normalized to an OD550 of 10 for Vκ3 to 4.5 mg/L for Vκ1 (Table 1).
The κ-like L-CDR3 has a conserved cis-proline at position 136 (numbering scheme for
variable domain residues according to Honegger & Plückthun, 2001). The amino acid
sequences of Vλ domains never show a proline at this position. Therefore, we used for these
domains a human consensus λ-like CDR3 (sequence: YDSSLSGV). The three human
consensus Vλ domains (Vλ1, Vλ2 and Vλ3) were also expressed in soluble form in the
periplasm of E. coli, but the yield after purification with IMAC and an anion exchange column
was much smaller than for the Vκ domains, ranging from 1.9 mg/L bacterial culture
normalized to an OD550 of 10 for Vλ2 to 0.3 mg/L for Vλ1 (Table 1).

Analytical gel filtration of VL fragments

While the monomeric VH fragments elute at the expected molecular weight of around 13 kDa
(Figure 1a), VL domains in 50 mM sodium phosphate (pH 7.0) and 500 mM NaCl interact
with different column materials. In the case of Vκ domains, the best results were obtained
with a Superose-12 column (Figure 1b). At a protein concentration of 50 µM, Vκ3 and Vκ2
elute at an apparent molecular weight of 2 kDa and Vκ4 at 12 kDa, while Vκ1 elutes as a
broad peak even at the total volume of the column. When the concentration of Vκ4 is changed
from 50 to 5 µM, the peak shifts to an apparent molecular weight of 2 kDa, indicating a
concentration-dependent dimer-monomer equilibrium under the assumption that Vκ domains
eluting at 2 kDa are monomeric and those eluting at 12 kDa are dimeric (see below). Addition
of 1 M GdnHCl or changing the NaCl concentration to 2 M did not alter the elution profile.
Vλ domains at concentrations of 5 µM show the weakest unspecific interaction with
silica-based TSK columns (Figure 1c); Vλ1 and Vλ2 elute at an apparent molecular weight of
7 kDa and Vλ3 elutes at an apparent molecular weight of 12 kDa.

To interpret these results from analytical gel filtration, the samples were also analyzed by
equilibrium ultracentrifugation. The method was used to calibrate the elution values of the
different columns for VL domains: Vκ3 and Vλ2 give results consistent with a monomer,
while Vλ3 shows a dimer (shown in Figure 1d with Vκ3 as an example). Therefore, the VL
domains Vκ2, Vκ3 and Vλ1, Vλ2, eluting at an apparent molecular mass of 6 and 2 kDa,
respectively, are indeed monomeric, and the VL domains Vκ4 and Vλ3, eluting at 12 kDa, are
dimeric. Vκ1, which elutes even at the total volume of the column, indicating a strong
interaction with the column material, behaves as a monomer in the ultracentrifugation (Table
1).

Equilibrium transition experiments of VL fragments

Most VL domains have only one tryptophan (the highly conserved Trp43), which is buried in
the core in the native state. Under native conditions no emission maximum could be
determined in the GdnHCl denaturation experiments, because the fluorescence is fully
quenched by the disulfide bond Cys23 - Cys106. During unfolding the tryptophan becomes
solvent exposed, giving a steep increase in fluorescence intensity. Therefore, the
thermodynamic parameters were calculated using the 6-parameter fit (Pace & Scholtz, 1997)
on the plot of fluorescence intensity versus GdnHCl concentration, giving curves consistent
with two-state behavior. All VL domains show reversible unfolding behavior (data not
shown). Figures 3(a) and 3(b) show relative fluorescence intensity plotted against GdnHCl
concentration for Vκ and Vλ domains. Vκ3 is the most stable VL domain with a ΔGN-U of
34.5 kJ mol⁻¹, followed by Vκ1 with 29.0 kJ mol⁻¹ and Vκ2 and Vλ1 with 24.8 and 23.7
kJ mol⁻¹, respectively (Table 1). The least stable VL domains are Vλ2 and Vλ3 with a ΔGN-U
of 16.0 and 15.1 kJ mol⁻¹, respectively. All VL domains show m values between 11.1 and
16.2 kJ mol⁻¹ M⁻¹, indicating that they have the cooperativity expected for a two-state
transition (Myers et al., 1995). The human consensus Vκ4 carries, in addition to the conserved
Trp43, an exposed tryptophan at position 58 that is not quenched in the native state. The
denaturation curve is fully reversible, but shows a steep pre-transition baseline followed by a
non-cooperative transition. Because of this uncertainty, no ΔGN-U value is reported for Vκ4
but only the midpoint of transition, which is at 1.5 M GdnHCl. For the Vκ4 domain Len, a
stability of 32 kJ mol⁻¹ has been reported (Raffen et al., 1999).
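
The text cites the 6-parameter fit without stating the equation. A common form of such a
model, given here only as a hedged sketch, combines a linear dependence of the free energy
on denaturant concentration with linear native and unfolded baselines. The function name,
the parameter names and the m value of 14.0 kJ mol⁻¹ M⁻¹ used in the illustration are
assumptions (the text gives only a range of m values); the temperature corresponds to the
25 °C at which spectra were recorded.

```python
import math

R_KJ = 8.314e-3  # gas constant in kJ mol^-1 K^-1


def two_state_signal(conc, dg_water, m_value,
                     native_icpt, native_slope,
                     unfolded_icpt, unfolded_slope,
                     temp_k=298.15):
    """Signal at denaturant concentration conc (M) for a two-state N <-> U model.

    The six fitted parameters are dg_water (kJ/mol), m_value (kJ/mol/M) and an
    intercept and slope for each of the native and unfolded baselines.
    temp_k is an assumed fixed temperature, not a fitted parameter.
    """
    dg = dg_water - m_value * conc                # linear extrapolation of dG
    k_unfold = math.exp(-dg / (R_KJ * temp_k))    # equilibrium constant [U]/[N]
    frac_unfolded = k_unfold / (1.0 + k_unfold)
    native = native_icpt + native_slope * conc
    unfolded = unfolded_icpt + unfolded_slope * conc
    return native + (unfolded - native) * frac_unfolded


# Illustration: reported dG of 34.5 kJ/mol (V-kappa-3) with a hypothetical m value.
midpoint = 34.5 / 14.0
signal_at_midpoint = two_state_signal(midpoint, 34.5, 14.0, 0.0, 0.0, 1.0, 0.0)
```

In practice such a function would be handed to a least-squares routine together with the
measured intensities; with flat baselines of 0 and 1 it reduces to the fraction of unfolded
protein.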

Analysis of primary sequence and model structures 

In the group of isolated VH fragments large differences are seen: VH3 shows the highest yield
of soluble protein and the highest thermodynamic stability; VH1a, VH1b and VH5 show
intermediate yield and intermediate or low stability; and VH2, VH4 and VH6 show more
aggregation-prone behavior and low cooperativity during denaturant-induced unfolding. The
properties of Vκ and Vλ domains are more homogeneous. The thermodynamic stabilities
differ by only approximately 10 kJ mol⁻¹ within the group of Vκ and within the group of Vλ
domains. In general, the stability and soluble yield are higher in isolated Vκ domains than in
Vλ domains. To analyze possible structural reasons for this different behavior of the variable
antibody domains, the primary sequences and the modeled structures of the seven human
consensus VH and VL domains were analyzed. The models have been published previously
(Knappik et al., 2000) for VH domains (PDB entries: 1DHA (H1a), 1DHO (H1b), 1DHQ
(H2), 1DHU (H3), 1DHV (H4), 1DHW (H5), and 1DHZ (H6)) and VL domains (PDB
entries: 1DGX (κ1), 1DH4 (κ2), 1DH5 (κ3), 1DH6 (κ4), 1DH7 (λ1), 1DH8 (λ2), 1DH9
(λ3)). The quality of the models varies for the different domains. Many antibody structures in
the Protein Data Bank use, for example, the VH3 framework, and the chosen template
structure for building the model shares 86 % sequence identity excluding the CDR3 region
(PDB entry: 1IGM) and the structural differences between templates could be traced to
distinct sequence differences. In the case of VH6, the closest templates were human VH4 and
murine VH8 domains, since no crystal structure of a member of the VH6 germline family is
available in the PDB. Both germline families encode a different framework 1 structural
subtype (I) than VH6 (III) (Honegger & Plückthun, 2001). The chosen template for VH6
(PDB entry: 7FAB) shares 62 % sequence identity, excluding the CDR3 region, and belongs
to human VH4. Three questions regarding the domains in isolation came up: Why is VH3 so
extraordinarily stable, why do VH2, VH4 and VH6 behave comparatively poorly concerning
expression and aggregation, and why do Vκ domains give higher yields and show higher
stability than Vλ domains?



Salt bridges 

Salt bridges between positively and negatively charged amino acids and repulsions between
equally charged amino acids play an important role in protein stability (Nakamura, 1996).
Figure 4a shows a schematic representation of a scFv fragment consisting of the Vκ3 and VH3
domains with its characteristic secondary structure. In Figure 4b, residues that are positively
charged at pH 7.0 are shown in gray and negatively charged residues are shown in black.
There is an accumulation of charged residues at the base of the domain. In VH domains, the
conserved residues Arg45, Glu53, Arg77, and Asp100 form buried conserved salt bridges
connecting Arg45 - Glu53, Arg45 - Asp100, and Arg77 - Asp100 (Figure 5a). At position 77
the VH5 consensus is Gln instead of the Arg found in the consensus of the other subfamilies
(Table 2). This change results in loss of the conserved salt bridge connecting Arg77 and
Asp100. In addition, charged residues at positions 97 and 99 can be part of the charge cluster.
Only VH1a, VH1b, VH3, and VH6 have Glu at position 99. These domains can form
additional salt bridges between Glu99 - Arg45, as seen in the structure with PDB entry 1IGM,
or between Glu99 - Arg77, as seen in structures with PDB entries 1BJ1, 1INE, 2FB4 and
1VGE.
In VL domains (Figure 5(b)) the amino acid at position 45 is uncharged and the ones in
positions 53 and 97 are either reversed in charge compared to the amino acids at these
positions in VH domains or are uncharged. Therefore, the charge cluster contains only one
conserved salt bridge connecting Arg77 and Asp100 and one main-chain side-chain hydrogen
bond connecting Glu97 and Arg77 (Figure 5(b)). The least stable Vκ domain, Vκ2, carries
Leu at position 45, which is unable to form a side-chain side-chain hydrogen bond to Tyr104,
which is conserved in the other VL domains and also in VH domains (Figures 5(a) and (b)).

Hydrophobic core packing 

Another important stabilizing factor is hydrophobic core packing (Pace, 1990). All model
structures were checked for cavities, which would indicate improper packing leading to fewer
van der Waals interactions and reduced thermodynamic stability. A van der Waals contact
surface was generated for a water radius of 1.4 Å with the program Molmol (Koradi et al.,
1996). When cavities were found, the surrounding residues were checked to see whether they
would contribute hydrophobic surface area to the cavity. A cavity lined with hydrophobic
residues would be less favorable, as a water molecule would be energetically unfavorable at
such a position. Based on these cavities and sequence comparisons between the different
variable domain frameworks, positions in the hydrophobic core could be identified which may
lead to sub-optimal packing. In Figure 4c, an overview of the analyzed core residues is given.
The core residues are divided into two regions: the upper and lower core according to the
orientation shown in Figure 4a. The upper core is built of buried residues above Trp43, the
conserved disulfide bridge between Cys23 and Cys106, and Gln/Glu6 towards the CDRs. Part
of the CDR residues are involved in the upper core, with the consequence that different CDRs
have a strong influence on the upper core (and its contribution to the overall stability) and,
vice versa, the residues of the upper core have an influence on the conformation of the CDRs
(and affinity or specificity of antigen binding) (Eigenbrot et al., 1993). The lower core is
below Trp43 and its conformation is related to the type of amino acid at positions 6, 7, 10 and
78 (Saul & Poljak, 1993).

Upper core 

The residues 2, 4, 25, 29, 31, 41, 80, 82, 89, and 108 form the upper core. In the sequence
alignment shown in Table 2 these residues have been compared for the variable domains. In
VH domains two sequence motifs can be distinguished: the VH3-like motif with two bulky
aromatic residues at positions 29 and 31 (VH1b, VH3, VH5), or at the alternative locations 25
and 29 (VH2), and the VH4/VH6 motif with Trp at position 41 and a big aliphatic residue at
position 25. Figure 6(a) shows a superposition of VH4 on VH3, highlighting the differences
between these motifs. In the VH3-like motif Phe29 and Phe31 fill the space between the
neighboring residues 2, 25, 31 and 108. In the VH4/VH6 motif, these two residues are
changed to smaller residues. Here Trp41 and the methyl group of Val25 fill up the empty
space. VH1a belongs to the VH3-like motif but has a Gly instead of Phe at position 29. No
other residue compensates for this empty space, which results in a hydrophobic cavity (Figure
6(b)). VH1a, VH1b and VH5 have an Ala instead of a Leu (VH3) at position 89. There is no
obvious compensation for this loss of an isopropyl group. In addition, the substitution of
Ala25 (VH3) to Gly in VH5 (Table 2) equals the loss of a methyl group, further weakening
the packing of the upper core of VH5 (Figure 6(c)).

Figure 6(d) shows the superposition of the upper core of the Vκ3 and Vλ1 domains as
representatives of Vκ and Vλ domains. The packing density of the Vκ domains is lower than
that of the VH domains, because there is only one bulky aromatic amino acid in the upper core
of Vκ domains, at position 89, compared to VH domains, which have at least two aromatic
residues (Table 2). The packing density is further lowered in Vλ domains because of the
smaller Gly in position 25 and Ala in position 89 instead of Ala/Ser and Phe, respectively,
which are found in Vκ domains (Figure 6(d), Table 2), consistent with a lower thermodynamic
stability of Vλ domains.

Lower core 

Within VH domains an interesting correlation is seen between stability and the framework 1
classification of Honegger and Plückthun (Honegger & Plückthun, 2001), which influences
hydrophobic core packing of the lower core (Saul & Poljak, 1993) and is determined by the
type of amino acid in positions 6, 7 and 10 (Table 3). The most stable VH3 domain falls into
subgroup II, while VH1a, VH1b and VH5 with intermediate properties fall into subgroup III
(Table 3). The VH domains showing high inclusion body propensity and no cooperative
denaturation, VH2 and VH4, fall into subgroup I. VH6 is a member of subgroup III because of
its Gln at position 6 and the absence of Pro in position 7. However, previous experiments
(Jung et al., 2001) have shown that Pro in position 10 destabilizes the domain.



Residues 19, 74, 78, 93, and 104 (Table 2) are part of the lower core, which is built of
residues 13, 19, 21, 45, 55, 74, 77, 78, 91, 93, 96, 100, 102, 104 and 145. Only VH3, the most
stable framework, has a bulky aromatic residue (Phe) at position 78. However, VH1a, VH1b,
and VH5 have Phe at position 74, thereby simply switching the residues in positions 74 and
78, probably leading to similar interactions (Figure 7(a)). VH5 has an additional exchange at
position 93 from Met to Trp. This additional aromatic residue in VH5 could help compensate
for the loss of Phe78 and the poor interactions in the charge cluster (see above). Apart from
Tyr104, no additional aromatic residue stabilizes the lower core of VH2, VH4, and VH6
(Figure 7(b)).

In VL domains only one framework 1 subtype is found (Honegger & Plückthun, 2001), and as
a consequence, the lower core residues of Vκ and Vλ domains are almost the same and have
similar orientations (Table 2 and Figure 7).



Residues possibly influencing solubility and folding efficiency 

Residues that could correlate with poor expression behavior and a high tendency to aggregate
due to kinetic rather than thermodynamic reasons (Fink, 1998) were further examined. The
analysis was started from a sequence alignment of the human consensus VH domains, grouped
into VH domains with good biophysical properties (VH1a, VH1b, VH3, VH5) and more
aggregation-prone VH domains (VH2, VH4, VH6) (Table 3).

It was shown previously that mutations of exposed hydrophobic residues do not change the
solubility of the native scFv fragment, as determined by salting-out, but have a profound
effect on the in vivo folding yield (Nieba et al., 1997). Position 5 is exposed to solvent and
therefore the hydrophilic residue Gln or Lys of VH2, VH4, and VH6 might be thought to
decrease the aggregation tendency in contrast to the hydrophobic Val in VH1a, VH1b, VH3,
and VH5. Nevertheless, in a selection experiment favoring stability (Jung et al., 1999), Val
was selected out of Val, Gln, Leu, and Glu in the scFv 4D5Flu, possibly indicating the
importance of local secondary structure propensity.

VH2, VH4 and VH6 have a non-glycine residue with a conserved positive phi angle at position
16 (Figure 4(d)), which causes an unfavorable local conformation. Structures that have been
determined with a non-Gly residue at position 16 (e.g. PDB entries 1C08, 1DQJ, 1F58)
indeed show that the positive phi angle is locally maintained, apparently enforced by the
surroundings. In contrast, the odd-numbered VH domains all have Gly at this position.
For the antibody McPC603, it has been shown (Knappik & Plückthun, 1995) that the
exchange of Pro47 to Ala, adjacent to another Pro at position 48, does not result in better
thermodynamic stability, but enhances folding efficiency. VH2 and VH4 also carry Pro at
position 47. In VH6, the highly conserved hydrophobic core residue Ile is exchanged to Thr at
position 58, which buries an unsatisfied hydrogen bond donor.

A proline residue in position H10 can have a strong influence on FR1 conformation. VH
structures can be classified into four subtypes with distinct FR1 conformations and correlated
differences in the packing of the lower core, depending on the type of amino acid found in
positions H6, H7 and H10 (Honegger & Plückthun, 2001a). To prove that these residues
indeed cause the different conformations, Jung et al. (2001) introduced different H6/H7/H10
residue combinations into the same VH domain and determined the effect on the structure by
X-ray crystallography. In their system, all combinations containing Pro in position 10 were
destabilized compared to molecules containing a Gly, Ala or Ser in this position. While these
constructs contained Pro in an "unnatural" combination with a VH domain normally
containing a different amino acid in this position, and therefore the destabilizing effect could
also be due to a mismatch between local sequence and overall sequence context, the poorly
behaved VH2, VH4 and VH6 all contain Pro10, while VH1a, VH1b, VH3 and VH5 have a Gly
or Ala in this position.
At position 44 the even-numbered VH domains carry Ile in contrast to the Val of the
odd-numbered VH domains. This position is located at the interface to VL and should have no
effect on the isolated domains, but it should have an effect when in complex with VL.
The exposed CDR2 residue 60 of the even-numbered VH domains is a bulky aromatic amino
acid (Trp or Tyr) and probably decreases folding efficiency. This residue cannot be
exchanged because of its possible participation in antigen binding.

The solvent-exposed residue 72 was changed in the antibody McPC603 from the hydrophobic
residue Ala to Asp, which increases the soluble / insoluble ratio 20-fold but does not alter the
thermodynamic stability (Knappik et al., 1995). VH6 carries a hydrophobic Val at this
position.

The odd-numbered VH domains have Gly at position 76 in contrast to the even-numbered VH
domains, which carry Thr or Ser. In half of the antibody structures found in the PDB, the
residue at this position has a positive phi angle, indicating that glycine could be favorable at
this position.

The semi-buried position 90 of VH1a, VH1b, VH3, and VH5 is occupied by Tyr, whereas
VH2, VH4, and VH6 have Val or Ser. The influence of this substitution on the poor behavior of
the even-numbered domains can only be tested experimentally.

As the VL domains can be primarily grouped into κ and λ domains, the analysis was
concentrated on a comparison between these two groups. At the solvent-exposed C-terminal
end at positions 146, 148 and 149, Vκ domains have charged amino acids in contrast to Vλ
domains, which have Thr, Leu and Gly, respectively, at these positions (Table 4, Figure 4(d)).
In addition, the hydrophilic Thr in position 138 of κ domains is exchanged to the hydrophobic
Val in λ domains (Table 4, Figure 4(d)). These exchanges to less hydrophilic residues in Vλ
domains possibly lower the folding efficiency of these domains and may be a contributing
factor to the smaller soluble yield compared to Vκ domains.

Proline is an α-helix and β-strand breaker and thus destabilizes those secondary structures.
Positions 12 and 18 in VL domains are both part of a β-sheet structure. Only Vκ2 has Pro at
both positions while Ser and Arg, respectively, are the dominant residues at these positions in
the other VL domains (Table 4, Figure 4(d)).

Expression and protein purification of scFv fragments 

After biophysical characterization of the isolated human consensus VH and VL domains,
systematic combinations of VH and VL were also tested to understand their mutual influence
on biophysical properties. The scFv format was chosen, in which the VH domain is linked via
a flexible peptide linker to the VL domain. To limit the number of VH - VL combinations
from the 49 possible, the most stable VH domain, VH3, was tested in combination with each
of the seven human consensus VL domains and, conversely, the most stable VL domain, Vκ3,
with each of the seven human consensus VH domains. The aim was to examine whether there
is a mutual compensation or addition of the individual biophysical properties of the isolated
variable domains in the scFv fragment or whether even synergistic effects can occur.
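
The reduction of the combination space described above can be made explicit with a short
enumeration. This is only an illustrative sketch; the ASCII labels (for example "k3" for
the kappa-3 light chain and "l1" for lambda-1) are shorthand chosen here and are not the
nomenclature of the application.

```python
from itertools import product

vh_domains = ["H1a", "H1b", "H2", "H3", "H4", "H5", "H6"]
vl_domains = ["k1", "k2", "k3", "k4", "l1", "l2", "l3"]  # k = kappa, l = lambda

# Every possible VH / VL pairing.
all_pairs = list(product(vh_domains, vl_domains))

# Keep only pairings that contain the most stable VH (H3) or the most stable VL (k3).
tested = [(vh, vl) for vh, vl in all_pairs if vh == "H3" or vl == "k3"]

print(len(all_pairs), "possible combinations,", len(tested), "tested")
```

The pairing of H3 with k3 belongs to both series and is therefore counted only once.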

All VH domains within the scFv fragments carry the same H-CDR3, which is derived from the
VH domain of the well-expressing antibody 4D5 (Knappik et al., 2000; Carter et al., 1992).
The Vκ and Vλ domains in the scFv fragments carry the κ- and λ-like L-CDR3, respectively.
All scFv fragments could be expressed in soluble form in the periplasm and purified with
IMAC, followed by an anion exchange column. Purity of the fragments was over 98 %, as
confirmed by SDS-PAGE analysis (data not shown), and the subsequent measurements were
all carried out with freshly purified proteins. To compare the expression yield of the scFv
fragments with the different VH or VL domains, we additionally isolated the scFvs with a
batch method. To test the error inherent in the yield determination, the scFv H3κ3 was purified
4 times independently. The yield of purified H3κ3 was 6.5 ± 0.2 mg from a 1 L bacterial
culture normalized to an OD550 of 10, which is approximately the final cell density in a
shaken flask under these conditions. Yields of all scFv fragments tested were normalized to
the yield of H3κ3 and were in the range of 2.6 to 12.4 mg/L (Table 5). H1aκ3 and H1bκ3,
with 11.1 mg/L and 12.4 mg/L, respectively (1.7- and 1.9-fold the amount of H3κ3), show
the highest yield, and H2κ3, H4κ3 and H6κ3 show the lowest yield of scFv fragments with the
Vκ3 domain, with 0.6-, 0.4- and 0.6-fold that of H3κ3, respectively. All scFv fragments with
VH3 but different VL domains show yields below that of H3κ3. The percentage of
insoluble protein was determined for H3κ3 in 4 independent measurements to be (30 ± 10) %.
The other scFv fragments tested show a percentage of insoluble protein between 10 % and 50
%, with the exception of H2κ3, H4κ3 and H6κ3, which show a percentage of insoluble protein
between 80 % and 90 % (Table 5).
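
The fold values quoted above follow from dividing each yield by the H3κ3 reference yield.
A small worked check in Python, using only numbers stated in the text (the dictionary keys
are ASCII shorthand introduced here for illustration):

```python
reference_yield = 6.5  # mg/L, purified H3-kappa3 as reported above

# Absolute yields in mg/L taken from the text.
absolute_yields = {"H1a-k3": 11.1, "H1b-k3": 12.4}
relative = {name: round(value / reference_yield, 1)
            for name, value in absolute_yields.items()}

# Converting a reported fold value back into an absolute yield.
lowest_fold = 0.4  # H4-kappa3 relative to H3-kappa3, from the text
lowest_yield = round(lowest_fold * reference_yield, 1)
```

The back-calculated lowest yield agrees with the lower end of the stated range of 2.6 to
12.4 mg/L.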



Analytical gel filtration of scFv fragments 

H3κ3 elutes from an analytical gel filtration column (Superdex-75) at a protein concentration
of 5 µM in 50 mM sodium phosphate (pH 7.0) and 500 mM NaCl with an apparent molecular
weight of 29 kDa, which indicates that H3κ3 is monomeric in solution. The other scFv
fragments with Vκ3 as the VL domain are also monomeric under these conditions, with the
exception of H1aκ3, which shows smaller dimer and multimer peaks besides the monomer
peak. H4κ3 shows in addition a small amount of dimer of less than 10 %. Figure 8(a) shows
the chromatogram of H3κ3 as an example for monomeric scFv fragments, along with H1aκ3
and H4κ3. The scFv fragments with VH3 and a Vκ domain are all monomeric, except H3κ1,
which shows in addition a small dimer peak (Figure 8(b) with H3κ3 as an example for
monomeric scFv fragments and H3κ1). In contrast, the scFv fragments with Vλ domains all
show monomer - dimer equilibria, with a dimer content from 20 % in the case of H3λ1 to 70 %
in the case of H3λ2 (Figure 8(b) with H3λ1 as an example for scFv fragments with a Vλ
domain). With 1 M GdnHCl in the elution buffer, all those scFv fragments which had a dimer
fraction under native conditions elute in a single peak at an apparent mass of 29 kDa,
indicating that they are now fully monomeric. The chromatogram in 1 M GdnHCl is shown
in Figure 8(a) for H1aκ3 and in Figure 8(b) for H3λ1 as an example for scFv fragments with a
Vλ domain. It should be noted that this concentration is below the major transition of all scFv
fragments. The only exception was H3λ2, which still has a dimer content of 20 % in 1 M
GdnHCl. With 2 M GdnHCl, H3λ2 also shows only a monomer peak (data not shown).

Equilibrium unfolding experiments of scFv fragments 

Unfolding and refolding of the scFv fragments as a function of denaturant concentration were
monitored by the shift of the maximum of the fluorescence emission after excitation at 280
nm. Each scFv fragment shows reversible unfolding behavior (data not shown). The
denaturation of scFv fragments is usually not a two-state process (Wörn & Plückthun,
2001), because scFv fragments are built from two domains, which may have different
intrinsic stabilities, interact over an interface region and can potentially stabilize each
other. Therefore, no ΔGN-U values are reported; instead the midpoints of the transitions of
denaturation are given, which are a semi-quantitative measure of the stability of the scFv
fragments. The assignment of the transitions to the VH or VL domain results from the
determination of the transitions of the single domains (Table 1). In Table 5 the midpoints are
listed for the VH and VL domain within the scFv fragments. If only one transition is visible,
the midpoint is assigned to both the VH and VL domain.

With the knowledge of the denaturation properties of the isolated VH and VL domains and of
the combinations of these domains in the scFv fragments, it is now possible to systematically
study the influence of the interface interaction on the stability of the scFv fragments. Different
cases can be distinguished (Wörn & Plückthun, 1999): If the stabilities of the isolated VH and
VL domains are very similar, the resulting scFv also has the same stability (see Figure 9(a)
with H5κ3 as an example). If one domain is significantly more stable than the other, the less
stable one can be stabilized through the interface interaction with the other domain (see
Figure 9(b) with H1aκ3, where the more stable Vκ3 stabilizes VH1a, and Figure 9(c) with
H3κ1, where the more stable VH3 stabilizes Vκ1). Nevertheless, it is also possible that,
although the stabilities of the domains differ, almost no stabilization of the less stable domain
occurs (see Figure 9(d) with H3κ2 as an example).

The scFv fragments with Vλ domains show an interesting behavior (Figure 10(a) with H3λ1
as an example) because the scFv fragments are even more stable than either of the isolated
domains. Apparently, the interface interaction between VH and VL is so strong that
the domains are stabilized above the intrinsic stability of the isolated domains. When the
interface finally breaks up, the now isolated domains in the scFv unfold directly, explaining
the steep transition. This extraordinary behavior strongly depends on the sequence of L-CDR3.
Vλ domains were also cloned and purified with the κ-like L-CDR3. The isolated Vλ domains
with the κ-like CDR3 gave very poor yields. They do not show reversible behavior in
denaturant-induced equilibrium denaturation and have lower midpoints of denaturation than
the corresponding Vλ domains with the λ-like L-CDR3. The combinations of VH3 with Vλ
domains carrying the κ-like CDR3 show similar yields and dimer / monomer ratios in
analytical gel filtration as the ones carrying the λ-like CDR3 (data not shown) but a different
behavior in GdnHCl denaturation. As an example, Figure 10(b) shows H3λ1 with a κ-like
L-CDR3, where the Vλ1 domain is only slightly stabilized in comparison to the renaturation
curve of the isolated Vλ1, indicating that the interface stabilization in this case is not as
strong. It should be noted that the only difference between the two scFv fragments in Figures
10(a) and (b) is the different L-CDR3, which evidently causes this dramatic difference in
stabilization. The κ-like CDR3 with proline in position 136 builds a rigid Ω-loop, which
probably interferes with the perfect orientation between VH and VL.

In summary, the most stable scFv fragments, which start to denature only above 2 M 
GdnHCl, are H3κ3, H1bκ3, H5κ3 and H3κ1. Although the isolated Vλ domains are rather 
unstable by themselves, in combination with VH3 they can build very stable scFv fragments, 
but depend on the L-CDR3 for this effect. Most likely this CDR is responsible for a favorable 
orientation of VL to VH and thus enables a tighter interaction through the interface. ScFv 
fragments with an intermediate stability, starting denaturation above 1 M GdnHCl, are H1aκ3, 
H2κ3, H3κ2 and H3κ4, while H4κ3 and H6κ3 are scFv fragments with a modest stability, 
starting denaturation below 1 M GdnHCl. 

Example 2: Structure-based Improvement of the Biophysical Properties of 
Immunoglobulin VH Domains with a Generalizable Approach 

Abbreviations 

CDR, complementarity-determining region; GdnHCl, guanidine hydrochloride; HuCAL, 
Human Combinatorial Antibody Library; IMAC, immobilized metal ion affinity 
chromatography; IPTG, isopropyl-β-D-thiogalactopyranoside; scFv, single-chain antibody 
fragment consisting of the variable domains of the heavy and of the light chain connected by a 
peptide linker; VH, variable domain of the heavy chain of an antibody; VL, variable domain of 
the light chain of an antibody. 

In a systematic study of V gene families carried out with consensus VH and VL domains alone 
and in combinations in scFv fragments, we found comparatively low expression yields and 
lower cooperativity in equilibrium unfolding in antibody fragments containing VH domains of 
human germline families 2, 4 and 6. From an analysis of the packing of the hydrophobic core, 
the completeness of charge clusters, the occurrence of unsatisfied hydrogen bonds, and 
residues with low β-sheet propensity, positive phi angles and exposed hydrophobic side chains, 
we pinpointed residues potentially responsible for the unsatisfactory properties of these 
germline-encoded sequences. Several of those are shared among the domains of the 
even-numbered subgroups, but do not occur in the odd-numbered ones. In this study, we have 
systematically exchanged those residues alone and in combination in two different scFv 
fragments using the VH6 framework, and we describe their effect on equilibrium stability and 
folding yield. We improved the stability by 20.9 kJ/mol and the expression yield by a factor 
of 4, and can now use these data to rationally engineer antibodies derived from this and similar 
germline families for better biophysical properties. Furthermore, we provide an improved 
design for libraries exploiting the significant additional diversity provided by these 
frameworks. Both antibodies studied here completely retain their binding affinity, 
demonstrating that the CDR conformations were not affected. 

Recombinant antibodies are used in an ever increasing number of applications from biological 
research to therapy. In addition to showing high antigen specificity and affinity, such 
recombinant antibodies should also be obtainable in high yield, have low tendency to 
aggregate and be stable against high denaturant concentrations, elevated temperatures and 
proteases, depending on the requested task. A popular format for many of these applications is 
the single-chain Fv (scFv) fragment, where the variable domain of the heavy chain (VH) is 
connected via a flexible linker to the variable domain of the light chain (VL) or vice versa 
(1-5). This format contains the complete antigen binding site and can be expressed in a wide 
range of hosts including bacteria (4) and yeast (5). While we chose to investigate these 
questions with scFv fragments, as their simple structure makes an untangling of domain 
interactions much easier, differences in physical properties are also manifest in Fab fragments 
and whole antibodies, which contain the same domains. 

Mutations important for the biophysical behavior can either influence the equilibrium 
thermodynamic stability or the aggregation tendency during folding or both. While these 
properties are distinguishable and mutations are known (see below) which influence only one 
of these properties, frequently they are related and amino acid exchanges can have an effect 
on both. Mutations influencing thermodynamic stability can make contributions to many 
different types of interactions, such as packing of the hydrophobic core, secondary structure 
propensity, charge interactions, hydrogen bonding, desolvation upon unfolding, compatibility 
with the enforced local structure, and many more (6, 7). Mutations that influence folding 
efficiency can also be part of this list, as the stability of intermediates is an important 
component. Additionally, however, natural proteins use "negative design" (8) to avoid 
aggregation. In its simplest form, this avoids hydrophobic patches on the surface. In the case 
of antibodies, such hydrophobic patches were found to have almost no effect on the solubility 
of the native protein, correctly defined as the maximal concentration of the soluble native 
protein (9). The hydrophobic patches can have a very dramatic effect on the folding yield and 
thus the yield of functional protein in E. coli, which is colloquially but incorrectly often 
termed "solubility", as the yield describes the overall process of producing soluble protein, 
but not its solubility. 

In the case of scFv fragments, a further complication is introduced by their two-domain 
nature. The two domains can stabilize each other and unfold either cooperatively or with an 
equilibrium intermediate, depending on the relative intrinsic stability of the domains and their 
interface (10). However, from these studies of domain interactions and a systematic study of 
isolated domains and their interactions (see Example 1, 11), we can now untangle this system. 
We can thus pinpoint the problem spots, and in the present study we wish to provide the 
evidence that a correction of these small defects indeed leads to a marked improvement of 
phenotypes. 

It is thus important to distinguish expression yield from thermodynamic stability. In the 
periplasmic expression of antibodies, the most important limitation on the observed 
expression level of functional protein is the periplasmic folding yield (4). Antibodies with 
poor yield of functional protein give rise to periplasmic aggregates. There are three principal 
mechanisms leading to an increased expression yield of soluble proteins: Increasing the total 
expression level (provided the folding yield stays constant), increasing the folding yield in E. 
coli or decreasing degradation by E. coli proteases. All three mechanisms can be somewhat 
influenced by extrinsic factors including the choice of bacterial strain, expression vector, 
media composition, and expression temperature (summarized in ref. (4)) and coexpression of 
periplasmic chaperones (12, 13). Nevertheless, the major contribution to changes of the 
expression yield of folded protein is due to changes in the protein sequence itself. In the case 
of secreted proteins placed in the same vector, the translation initiation region and the 
beginning of the protein sequence (the signal sequence) are identical between different variants. 
Therefore, sequence changes are extremely unlikely to influence translation per se. Mutations 
leading to higher thermodynamic stability often also decrease protease digestion of the 
protein, as the E. coli proteases usually prefer unfolded protein as a substrate. Nevertheless, 
mutations removing potential cutting sites for E. coli proteases may also prevent degradation. 
Mutations may thus also influence the efficiency of folding, independent of influencing the 
equilibrium thermodynamic stability of the protein. Side reactions of the folding process often 
lead to aggregated protein, which is enriched in inclusion bodies. The kinetic partitioning into 
productive folding and aggregation can be influenced by mutations that either increase the 
thermodynamic stability of intermediates, remove a solvent-exposed hydrophobic residue 
or otherwise make the surface less suitable for aggregate growth ("negative design" (8)). In 
addition, the mutations increasing folding efficiency can also indirectly lead to a higher total 
expression level by preventing the formation of toxic side-products, most likely soluble 
aggregates, which lead to leakiness of the outer membrane and eventually decrease the 
viability of E. coli. 

There are different approaches to finding residues that improve the thermodynamic stability and 
yield of soluble protein of scFv fragments (reviewed by Wörn & Plückthun (7)). Previously, 
most work had concentrated on the optimization of individual antibodies. If the 
three-dimensional (3D) structure of the antibody to be improved is known, a detailed analysis 
can identify problematic residues, which can then be exchanged by site-directed mutagenesis 
(14-16). A second approach uses random mutagenesis followed by selection with a bias toward 
the improvement of the desired property (17-19). A third approach, the consensus approach 
(20), uses the sequence information from antibodies naturally encoded by the immune system. 
The genes of immunoglobulin variable domains, as is assumed for all gene families, have 
diverged by multiple gene duplications and mutations. Selected genes are further subjected to 
an accelerated "local" evolution by somatic mutations that optimize the capacity of the 
antibody to bind to antigen structures with high affinity, but these mutations are not 
propagated in the germline. In contrast, mutations acquired during the duplication of the 
primordial V gene to make the present-day Ig-locus are manifest as germline family-specific 
differences. In this study, we wanted to explore a generic approach for improving the 
biophysical properties of antibodies, combining the above knowledge with our knowledge of the 
biophysical properties of the germline-encoded VH, Vκ and Vλ families (see Example 1, 11). 
Since we focus on genes with initially germline-encoded sequences, our approach is not 
limited to improving individual molecules, and thus to removing changes introduced by 
somatic mutations, but particularly addresses problematic residues encoded by different 
germline genes. 

Destabilizing mutations may be highly probable but are selectively neutral as long as the 
overall domain stability does not fall below a certain threshold (20). Conversely, random 
mutations resulting in increased thermodynamic stability are highly improbable in the absence 
of a positive selection. Consequently, the most frequent amino acid at any position in an 
alignment of homologous immunoglobulin variable domains should be most favorable for the 
stability of the protein domain. This method was tested on a Vκ domain, and of ten proposed 
mutations six increased the stability. Nevertheless, the simplification inherent in this approach 
is that all frameworks are averaged to a single "ideal" sequence. The different germline genes 
or frameworks have an important function for antibody diversity. First, framework residues in 
the outer loop and close to the 2-fold axis can contribute important interactions to protein- and 
hapten-antigens, respectively. Second, several framework regions can influence the 
conformation of the CDRs and thereby indirectly modulate antigen binding. Third, different 
frameworks carry mutually incompatible residues, which cannot simply be exchanged to 
those of other frameworks. It follows that family-specific solutions are needed to create a 
variety of different frameworks with superior properties. In this paper we provide the basis for 
this approach. 
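
The consensus idea described above (taking the most frequent amino acid at each aligned position) can be illustrated with a minimal sketch. The toy alignment below is hypothetical and is not taken from any real antibody sequence; the function names are likewise illustrative only. Averaging within one germline family rather than across all frameworks, as argued above, would correspond to calling the function on family-specific alignments.

```python
from collections import Counter

def consensus(aligned_seqs):
    """Return the most frequent residue at each column of an alignment."""
    if not aligned_seqs:
        raise ValueError("need at least one sequence")
    length = len(aligned_seqs[0])
    if any(len(s) != length for s in aligned_seqs):
        raise ValueError("sequences must be aligned to equal length")
    result = []
    for column in zip(*aligned_seqs):
        # most_common(1) returns [(residue, count)]; on a tie the residue
        # encountered first in the column is returned
        result.append(Counter(column).most_common(1)[0][0])
    return "".join(result)

def deviations(seq, cons):
    """List (position, residue, consensus residue) where seq departs from cons."""
    return [(i + 1, a, b) for i, (a, b) in enumerate(zip(seq, cons)) if a != b]

# Hypothetical toy alignment (assumption, for illustration only)
toy = ["QVQLV", "EVQLV", "QVQLQ", "QVKLV"]
cons = consensus(toy)
print(cons)                       # QVQLV
print(deviations("QVQLQ", cons))  # [(5, 'Q', 'V')]
```

Positions reported by `deviations` would be candidates for exchange toward the consensus residue, subject to the structural caveats discussed in the text.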

Recently, we analyzed the biophysical properties of human germline family-specific 
consensus domains (see Example 1, 11) derived from the Human Combinatorial Antibody 
Library (HuCAL™) (21). In the case of the VH domains we found that the VH3 germline 
family-specific consensus domain was the most stable VH domain, followed by the VH1a, VH1b 
and VH5 consensus domains with intermediate stabilities and little or no aggregation-prone 
behavior. VH2, VH4 and VH6 domains, on the other hand, showed low cooperativity during 
denaturant-induced unfolding, lower yield and a higher tendency to aggregate. The detailed 
analysis of hydrophobic core packing and formation of salt bridges revealed that the VH3 
domain had always found the optimal solution, while all other VH domains had some 
shortcomings, explaining the higher thermodynamic stability of VH3. Furthermore, with the 
help of a sequence alignment grouped into VH domains with favorable properties (families 1, 3 
and 5) and unfavorable properties (families 2, 4 and 6), residues of the even-numbered VH 
domains were identified and structurally analyzed that potentially decrease the folding 
efficiency and may thus be the reason for the unfavorable properties. 

In this study, we used a structure-based approach exploiting the knowledge of the biophysical 
properties of the human germline family-specific consensus VH domains (see Example 1, 11), 
and in addition, resorting to tables of published and in-house selection experiments (A. 
Honegger et al., unpublished) to improve the VH6 framework as a model. We chose the VH6 
framework because it shows a somewhat aggregation-prone behavior and the lowest 
midpoint of denaturation compared to the other human VH domains, indicating that VH6 is 
the VH domain with the lowest thermodynamic stability. These properties were observed with 
isolated domains as well as in the scFv format with Vκ3 (see Example 1, 11). We used two 
scFv fragments containing the VH6 framework which had been selected from the HuCAL 
(21): 2C2, binding the peptide M18 coupled to transferrin, and 6B3, binding myoglobin (see 
Materials and Methods for details). With site-directed mutagenesis and based on our 
structural analysis we introduced six mutations (Q5V, S16G, T58I, V72D, S76G and S90Y) 
alone and in several combinations, which were hypothesized to be independently acting and 
individually exchangeable, and which also distinguish the group of VH families 
with favorable properties from the families with less favorable properties. We compared these 
mutants to the wild-type scFv fragments for effects on folding yield and, independently, the 
free energy of unfolding as a measure of the thermodynamic stability, and determined the 
additivity of these mutations. 

Construction of Expression Vectors 

The scFv fragment 2C2 (A. Hahn et al., MorphoSys AG, unpublished results) with the human 
consensus domains VH6 and VLκ3 (H-CDR3: QRGHYGKGYKGFNSGFFDF and L-CDR3: 
QYYNIPT) was obtained by panning against the peptide M18 with the sequence 
CDAFRSEKSRQELNTIASKPPRDHVF coupled to transferrin (Jerini GmbH, Berlin), while 
the scFv fragment 6B3 (S. Müller et al., MorphoSys AG, unpublished results) with VH6 and 
VLλ3 (H-CDR3: SYFISFFSFDY and L-CDR3: SYDSGFSTV) was obtained by panning 
against myoglobin from horse skeletal muscle (Sigma). Both scFv fragments were subcloned 
via the restriction sites XbaI and EcoRI into the expression plasmid pMX7 (21). The different 
mutations were introduced with the QuikChange™ site-directed mutagenesis kit from 
Stratagene according to the manufacturer's instructions. Multiple mutations were constructed 
by exchanging restriction fragments using unique XbaI, XhoI, BsaSI and EcoRI sites in the 
antibody. The final expression cassettes consist of a phoA signal sequence, a short FLAG-tag 
(DYKD), the scFv fragment in the orientation VH6 domain - (Gly4Ser)4 linker - VL domain, 
followed by a long FLAG-tag (DYKDDDD) and a hexahistidine-tag. 

Expression and purification 

Thirty mL of dYT medium (containing 30 µg/mL chloramphenicol, 1.0% glucose) was 
inoculated with a single bacterial colony and shaken overnight at 25°C. One liter of dYT 
medium (containing 30 µg/mL chloramphenicol, 50 mM K2HPO4) was inoculated with this 
preculture and incubated at 25°C (5 L flask with baffles, 105 rpm). Expression was induced at 
an OD550 of 1.0 by addition of IPTG to a final concentration of 0.5 mM. Incubation was 
continued for 18 hours, during which the cell density reached an OD550 between 8.0 and 11.0. 
Cells were collected by centrifugation (8000 g, 10 min at 4°C), resuspended in 40 ml of 50 mM 
Tris-HCl (pH 7.5) and 500 mM NaCl and disrupted by French Press lysis. The crude extract 
was centrifuged (48,000 g, 60 minutes at 4°C) and the supernatant passed through a 0.2 µm 
filter. The proteins were purified using the two-column coupled in-line procedure (4). In this 
strategy, the eluate of an immobilized metal ion affinity chromatography (IMAC) column, 
which exploits the C-terminal His-tag, was directly loaded onto an ion-exchange column. 
Elution from the ion-exchange column was achieved with a 0-800 mM NaCl gradient. The 
constructs derived from the scFv 2C2 were purified with an HS cation-exchange column in 10 
mM MES (pH 6.0) and those derived from 6B3 with an HQ anion-exchange column in 10 
mM Tris-HCl (pH 8.0). Pooled fractions were dialyzed against 50 mM Na-phosphate, pH 7.0, 
100 mM NaCl. Protein concentrations were determined by OD280. The soluble yield was 
normalized to a one liter bacterial culture with an OD550 of 10. 
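
The yield normalization stated above can be written out as a short calculation. This is a minimal sketch that assumes the yield scales linearly with culture volume and with the final OD550; the text does not spell out the formula, and the function name and the example numbers are hypothetical.

```python
def normalized_yield_mg(purified_mg, culture_volume_l, final_od550,
                        reference_od550=10.0):
    """Scale a purified protein amount to 1 L of culture at the reference OD550.

    Assumes (not stated in the text) that yield is proportional to both
    culture volume and final cell density.
    """
    if culture_volume_l <= 0 or final_od550 <= 0:
        raise ValueError("volume and OD550 must be positive")
    per_liter = purified_mg / culture_volume_l
    return per_liter * reference_od550 / final_od550

# Hypothetical run: 1.0 mg purified from 1 L that reached OD550 = 8.0
print(round(normalized_yield_mg(1.0, 1.0, 8.0), 2))  # 1.25
```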

Gel filtration chromatography 

Samples of purified scFv fragments were analyzed on a Superdex-75 column equilibrated 
with 50 mM Na-phosphate, pH 7.0, 500 mM NaCl, on a SMART-system (Pharmacia). The 
samples were injected at a concentration of 5 µM in a volume of 50 µl, and the flow-rate was 
60 µl/min. Lysozyme (14 kDa), carbonic anhydrase (29 kDa) and bovine serum albumin (66 
kDa) were used as molecular weight standards. 
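
An apparent molecular weight is commonly read off such standards by fitting a straight line to log10(molecular weight) versus elution volume. The sketch below assumes this calibration model, which is not stated in the text. The molecular weights are those of the three standards named above, whereas the elution volumes are hypothetical placeholders and not measured values.

```python
import math

# Molecular weights (kDa) of the standards named in the text; the elution
# volumes (mL) are hypothetical placeholders, not measured values.
STANDARDS = [(1.45, 14.0), (1.25, 29.0), (1.03, 66.0)]

def fit_line(xs, ys):
    """Ordinary least-squares line y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def apparent_mw_kda(elution_volume_ml, standards=STANDARDS):
    """Apparent molecular weight from a log10(MW) versus volume calibration."""
    volumes = [v for v, _ in standards]
    log_mw = [math.log10(mw) for _, mw in standards]
    slope, intercept = fit_line(volumes, log_mw)
    return 10 ** (slope * elution_volume_ml + intercept)

# A sample eluting with the carbonic anhydrase standard comes out near 29 kDa
print(round(apparent_mw_kda(1.25)))
```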

Equilibrium denaturation experiments 

Fluorescence spectra were recorded at 25 °C with a PTI Alpha Scan spectrofluorimeter 
(Photon Technologies, Inc., Ontario, Canada). Slit widths of 2 nm were used both for 
excitation and emission. Protein/GdnHCl mixtures (1.6 ml) containing a final protein 
concentration of 0.5 µM and denaturant concentrations ranging from 0 to 5 M GdnHCl were 
prepared from freshly purified protein and a GdnHCl stock solution (8 M, in 50 mM 
Na-phosphate, pH 7.0, 100 mM NaCl). Each final concentration of GdnHCl was determined by 
measuring the refractive index. After overnight incubation at 10°C, the fluorescence emission 
spectra of the samples were recorded from 320 to 370 nm with an excitation wavelength of 
280 nm. With increasing denaturant concentrations, the maxima of the recorded emission 
spectra shifted from about 340 to 350 nm. The fluorescence emission maximum was 
determined by fitting the fluorescence emission spectrum to a Gaussian function and was 
plotted versus the GdnHCl concentration. Protein stabilities were calculated as described 
(22,23). To compare scFv denaturation curves in one plot, the emission maxima were scaled 
by setting the highest value to 1 and the lowest to 0 to give normalized emission maxima. 
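
The two data-reduction steps described above (locating the emission maximum from a Gaussian fit, then scaling the maxima between 0 and 1) might be sketched as follows. The peak finder is a simplified stand-in for a nonlinear Gaussian fit: it fits a least-squares parabola to the logarithm of the intensities, which is exact for a noise-free Gaussian but weights noisy data differently. The fitting software actually used is not stated in the text, and the spectrum below is synthetic.

```python
import math

def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def gaussian_peak_nm(wavelengths, intensities):
    """Peak position from a least-squares parabola through ln(intensity)."""
    center = sum(wavelengths) / len(wavelengths)
    xs = [w - center for w in wavelengths]    # centering improves conditioning
    ys = [math.log(i) for i in intensities]   # requires positive intensities
    s = [sum(x ** k for x in xs) for k in range(5)]
    t = [sum(y * x ** k for x, y in zip(xs, ys)) for k in range(3)]
    a = [[s[0], s[1], s[2]], [s[1], s[2], s[3]], [s[2], s[3], s[4]]]
    d = det3(a)
    # Cramer's rule for the linear (b) and quadratic (c) coefficients
    b = det3([[a[r][0], t[r], a[r][2]] for r in range(3)]) / d
    c = det3([[a[r][0], a[r][1], t[r]] for r in range(3)]) / d
    if c >= 0:
        raise ValueError("spectrum has no maximum in the fitted range")
    return center - b / (2.0 * c)

def normalize(values):
    """Scale values so that the lowest maps to 0 and the highest to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("values must not all be equal")
    return [(v - lo) / (hi - lo) for v in values]

# Synthetic spectrum from 320 to 370 nm with its maximum at 343 nm
spectrum_nm = list(range(320, 371))
synthetic = [100.0 * math.exp(-((w - 343.0) ** 2) / (2 * 20.0 ** 2))
             for w in spectrum_nm]
print(round(gaussian_peak_nm(spectrum_nm, synthetic), 3))  # 343.0
print(normalize([340.0, 345.0, 350.0]))                    # [0.0, 0.5, 1.0]
```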

Enzyme linked immunosorbent assay (ELISA) 

Myoglobin from horse skeletal muscle (Sigma) and peptide M18 coupled to transferrin (Jerini 
GmbH, Berlin) at a concentration of 5 µg/ml in 50 mM Na-phosphate, 100 mM NaCl, pH 7.0 
were coated overnight at 4°C on Maxisorb 96-well plates (Nunc). Plates were blocked in 
2.0% sucrose, 0.1% bovine serum albumin (Sigma), 0.9% NaCl for 2 h at room 
temperature. After incubation of samples at concentrations from 2 µM to 0.125 µM, bound 
scFv fragments were detected using an α-tetra-His antibody (Qiagen) followed by an 
anti-mouse antibody conjugated with alkaline phosphatase (Sigma). 

BIAcore measurements 

BIAcore analysis was performed using a CM5-chip (Amersham Pharmacia) with one lane 
coated with 2,700 resonance units (RU) of myoglobin from horse skeletal muscle (Sigma), 
one coated with 2,500 RU of peptide M18 coupled to transferrin (Jerini GmbH, Berlin) and one 
blank lane as a control surface. Each binding-regeneration cycle was performed at 25 °C with 
a constant flow rate of 25 µL/min with different antibody concentrations ranging from 5 µM 
to 0.08 µM in 20 mM HEPES (pH 7.0), 150 mM NaCl and 0.005% Tween 20, and with 2 M 
NaSCN for regeneration. Determination of the antigen dissociation constant in solution was 
performed with competition BIAcore (24,25) with the same chip, buffer and regeneration 
conditions. ScFv fragments at constant concentration and variable amounts of antigen were 
preincubated for at least one hour at 10°C and injected in a sample volume of 100 µL. Data 
were evaluated by using BIAevaluation software (Pharmacia) and SigmaPlot (SPSS Inc.). 
Slopes of the association phase of linear sensorgrams were plotted against the corresponding 
total antigen concentrations and the dissociation constant was calculated as described 
previously (26). 

Properties of the wild type scFv fragments 

We chose the VH6 framework as the model system to test our strategy for improving the 
biophysical properties by a structure-based design and used two scFv fragments selected from 
the HuCAL as model systems: 2C2, which binds the peptide M18 coupled to transferrin and 
consists of VH6 paired with Vκ3, and 6B3, which binds myoglobin and consists of VH6 paired 
with Vλ3. The two antibodies differ in CDR3 (see Materials and Methods), but otherwise the 
VH6 sequence is identical. The wild-type (wt) scFv fragments 2C2 and 6B3 were expressed in 
the periplasm of E. coli. The scFv fragments were purified from the soluble fraction of the 
cell extract by immobilized metal affinity chromatography (IMAC), followed by an 
ion-exchange column. The purity of the scFv fragments was greater than 98%, as determined by 
SDS-PAGE (data not shown). The soluble yield after purification from a one liter bacterial 
culture, normalized to an OD550 of 10, was 1.2 ± 0.1 mg for 2C2-wt and 0.4 ± 0.1 mg for 
6B3-wt. Approximately 10% and 25%, respectively, of the total amount of expressed 
protein was found in insoluble form, as determined by Western Blot. The oligomeric state was 
determined by analytical gel filtration. Both proteins elute with an apparent molecular weight 
of 29 kDa, indicating that they are monomeric (Figure 11). The thermodynamic stability of 
each protein was measured by equilibrium GdnHCl denaturation. Unfolding of the scFv 
fragments was monitored by the shift of the fluorescence emission maximum as a function of 
denaturant concentration. Figure 12(a) shows the denaturation curves of 2C2-wt and 6B3-wt. 
Both curves show only one transition, indicating that VH and VL within the scFv fragment 
denature simultaneously (10). Since the fluorescence intensity of the folded and unfolded 
state is similar, and the maximum changes by only 17 nm, the shift in maximum can be used 
to determine the population of unfolded molecules (27). Under the assumption that the 
unfolding of the scFv fragments is a two-state process, the free energy of unfolding ΔGN-U can 
be determined (28,29). 2C2-wt showed a ΔGN-U of 51.3 kJ/mol and 6B3-wt a ΔGN-U of 51.3 
kJ/mol with m-values of 25.2 kJ mol⁻¹ M⁻¹ and 27.4 kJ mol⁻¹ M⁻¹, respectively. These m-values 
lie in the expected range for proteins of this size, indicating that both scFv fragments have the 
cooperativity expected for a two-state process (30). 
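
The relation between ΔGN-U, the m-value and the denaturation midpoint can be made explicit under the linear extrapolation form of the two-state model, in which ΔG([D]) = ΔGN-U - m[D]. The sketch below assumes this form and a temperature of 298.15 K; both are assumptions for illustration, since the text only refers to the cited methods. The input numbers are the 2C2-wt values quoted above.

```python
import math

R_KJ = 8.314462618e-3  # gas constant in kJ mol^-1 K^-1

def midpoint_molar(dg_water, m_value):
    """Denaturant concentration (M) at which half of the molecules are unfolded."""
    return dg_water / m_value

def fraction_unfolded(denaturant_m, dg_water, m_value, temp_k=298.15):
    """Two-state unfolded fraction with a linear dependence of ΔG on denaturant."""
    dg = dg_water - m_value * denaturant_m   # kJ/mol at this denaturant concentration
    k_unfold = math.exp(-dg / (R_KJ * temp_k))
    return k_unfold / (1.0 + k_unfold)

# 2C2-wt values from the text: ΔGN-U = 51.3 kJ/mol, m = 25.2 kJ mol^-1 M^-1
mid = midpoint_molar(51.3, 25.2)
print(round(mid, 2))                                    # 2.04
print(round(fraction_unfolded(mid, 51.3, 25.2), 3))     # 0.5
```

A lower m-value at the same midpoint gives a shallower transition, which is how reduced cooperativity shows up in such curves.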



Structural rationale for the selection of mutations 

The first set of mutants to improve the properties of scFv fragments 2C2 and 6B3 containing 
the human VH6 framework was chosen from the analysis of the structural model, guided by 
the sequence alignment of the human consensus VH domains grouped into VH domains with 
favorable biophysical properties (families 1, 3 and 5) and VH domains with less favorable 
properties (families 2, 4 and 6) (Figure 13). We focused on residues of the framework and 
excluded the CDR regions, since we aim to identify generically applicable mutations unlikely 
to affect antigen binding. The residues that we investigated in 2C2 and 6B3, together with the 
reasoning behind the specific changes are the following: 

Q5V: In a selection experiment of the scFv 4D5Flu favoring stability, Val was selected at this 
position out of Val, Gln, Leu, and Glu (18). Position 5 is part of the first β-strand and Val has 
a higher β-sheet propensity than Gln (31). Nevertheless, it was shown previously that mutations 
of exposed hydrophobic residues have a profound effect on the in vivo folding yield (9). 
Figure 14 shows that Gln in position 5 of the model of a VH6-VLκ3 scFv fragment (21) (PDB 
entries: 1DHZ (VH6) and 1DH5 (VLκ3)) is exposed to solvent, and therefore the hydrophilic 
residue Gln or Lys of VH2, VH4 and VH6 might be thought to enhance folding efficiency in 
contrast to the hydrophobic Val in VH1a, VH1b, VH3 and VH5. In summary, this mutation 
increases β-sheet propensity at the expense of creating an exposed hydrophobic residue. 

S16G: VH2, VH4 and VH6 carry a non-glycine residue, nevertheless with a conserved positive 
phi angle, at position 16 in the loop of framework 1 (Figure 14), which probably causes an 
unfavorable local conformation. Structures that have been determined with a non-Gly residue 
at position 16 (e.g. PDB entries 1C08, 1DQJ, 1F58) indeed show that the positive phi angle is 
locally maintained, apparently enforced by the surroundings. In contrast, the odd-numbered 
VH domains all have Gly at this position. 

T58I: The residue at position 58, which is the highly conserved Ile, points into the 
hydrophobic core (Figure 14). Only VH6 has Thr at this position, burying an unsatisfied 
hydrogen bond donor. Therefore, this residue was changed to Ile. 

V72D: The solvent-exposed residue 72 (Figure 14) was changed in the antibody McPC603 
from Ala to Asp, which increased the ratio of protein found in the soluble periplasmic fraction 
compared to the insoluble periplasmic fraction 20-fold, but did not measurably alter the 
thermodynamic stability (15), indicating that it might have an effect on the folding efficiency. 
Only the consensus sequence of the most stable VH family, VH3, has Asp at this position. 

S76G: The odd-numbered VH domains have Gly at position 76 in framework 2 (Figure 14), in 
contrast to the even-numbered VH domains, which carry Thr or Ser. In half of the known 
antibody structures found in the PDB, the residue at this position has a positive phi angle, 
indicating that glycine could be a better choice at this position. 

S90Y: The semi-buried position 90 (Figure 14) of VH1a, VH1b, VH3 and VH5 is occupied by 
Tyr, whereas VH2, VH4 and VH6 have Val or Ser. This residue is part of the β-sheet of the 
immunoglobulin fold and is exchanged to Ser in VH6, but Tyr has a higher β-sheet propensity 
than Ser (31). 

In positions 20 and 88, group-specific differences are seen, too (Figure 13). The residues in 
both positions are solvent-exposed and participate in a β-sheet. At position 20 the 
odd-numbered VH domains have the basic residues Lys and Arg, while the even-numbered 
domains show Thr or Ser. In position 88 all domains with favorable properties contain Thr 
and the domains with unfavorable properties contain Gln. However, as all these residues are 
hydrophilic and have similar β-sheet propensities, it might be expected that the difference in 
folding efficiency is small. Therefore, these residues were not exchanged. 
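
The criterion of a group-specific difference used in this section can be illustrated with a small sketch. The residue sets below are transcribed from the descriptions above for positions 5, 20, 76, 88 and 90. Position 16 is omitted because the non-glycine residues of the even-numbered families are not all named, and positions 58 and 72 are omitted because they do not separate the two groups in the text. The data layout and function name are illustrative assumptions.

```python
# Residues per framework position as described in the text for the VH
# families with favorable (1, 3, 5) and less favorable (2, 4, 6) properties.
FAVORABLE = {5: {"V"}, 20: {"K", "R"}, 76: {"G"}, 88: {"T"}, 90: {"Y"}}
UNFAVORABLE = {5: {"Q", "K"}, 20: {"T", "S"}, 76: {"T", "S"},
               88: {"Q"}, 90: {"V", "S"}}

def group_specific_positions(group_a, group_b):
    """Positions present in both groups at which no residue is shared."""
    shared_positions = group_a.keys() & group_b.keys()
    return sorted(p for p in shared_positions
                  if group_a[p].isdisjoint(group_b[p]))

print(group_specific_positions(FAVORABLE, UNFAVORABLE))  # [5, 20, 76, 88, 90]
```

Such a list only nominates candidates. As described above, positions 20 and 88 were then set aside on structural grounds, so the sequence criterion alone does not decide which residues are exchanged.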

Single mutations 

The six mutations (Q5V, S16G, T58I, V72D, S76G and S90Y) described above were 
introduced into 2C2-wt and 6B3-wt by site-directed mutagenesis. All scFv fragments carrying 
one mutation were expressed and purified in an identical manner to the wild-type scFv 
fragments and were monomeric in solution (data not shown). In all single and subsequently 
constructed multiple mutants the proportion of soluble to insoluble protein in the periplasm 
stayed constant, even in those cases where the total expression level increased. The 
biophysical data are summarized in Table 7. To compare the improvements caused by the 
mutations in 2C2 and 6B3, the expression yield of soluble protein is normalized to the yield 
of the corresponding wild-type scFv fragment and the free energy of unfolding (ΔGN-U) is 
given as the difference (ΔΔGN-U) to the corresponding scFv-wt. The denaturant-induced 
unfolding curves are shown in Figure 12(b). 
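
The two relative quantities used for this comparison can be written out as follows. The wild-type values for 2C2 (1.2 mg and 51.3 kJ/mol) are taken from the text; the mutant values in the example are hypothetical, and the function names are illustrative.

```python
def relative_yield(mutant_mg, wild_type_mg):
    """Yield of soluble protein as a multiple of the wild-type yield."""
    if wild_type_mg <= 0:
        raise ValueError("wild-type yield must be positive")
    return mutant_mg / wild_type_mg

def delta_delta_g(mutant_dg, wild_type_dg):
    """ΔΔGN-U in kJ/mol; positive values mean the mutant is more stable."""
    return mutant_dg - wild_type_dg

# Hypothetical mutant of 2C2: 2.4 mg yield and ΔGN-U = 57.5 kJ/mol
print(relative_yield(2.4, 1.2))                # 2.0
print(round(delta_delta_g(57.5, 51.3), 1))     # 6.2
```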

Both single mutations exchanging the non-glycine residues with positive phi angles (S16G 
and S76G) increased the yield of soluble protein by a factor of approximately two. The 
thermodynamic stability was also increased by both single mutations, with ΔΔGN-U of 6.2 and 
7.3 kJ/mol for 2C2-S16G and 6B3-S16G and ΔΔGN-U of 3.7 and 3.5 kJ/mol for 2C2-S76G 
and 6B3-S76G, respectively, compared to the wild-type scFv fragments. The mutation to Gly 
in a loop region causes a higher flexibility, which enables the optimal orientation of the 
anti-parallel β-sheet, stabilizing the whole domain. The higher yield of these mutants is probably 
due to the increased protease resistance and folding efficiency caused by the stabilized folded 
state of the protein. 

The mutation of the OH-carrying Thr58 to Ile, pointing into the hydrophobic core, did not 
alter the yield of soluble protein but caused a marked increase of thermodynamic stability 
with ΔΔGN-U of 7.9 and 6.8 kJ/mol for 2C2-T58I and 6B3-T58I, respectively. This 
remarkable improvement in stability is due to the additional van der Waals interaction of the 
hydrophobic Ile within the hydrophobic core and to the absence of the desolvation necessary 
when burying Thr. Interestingly, this mutation does not have an effect on the yield of soluble 
protein, indicating that the folding efficiency is not increased. 

Both mutations exchanging a residue in a β-sheet to a residue with higher β-sheet propensity 
(Q5V and S90Y) resulted in an approximately 1.8-fold increase in yield of soluble protein. In 
addition, the thermodynamic stability is slightly increased with the exception of 2C2-S90Y, 
which even shows a very small decrease in comparison to the wild-type scFv fragment. The 
analysis of these constructs shows that mutations of residues that participate in a β-sheet to 
a residue with higher β-sheet forming propensity can increase the yield of soluble protein due 
to a higher folding efficiency. Depending on the scFv fragment, the thermodynamic stability is 
also increased, probably because of better orientation of the mutated residue, facilitating the 
orientation of stabilizing hydrogen bonds in the β-sheet. 

The last single mutation exchanges a solvent-exposed hydrophobic residue with a hydrophilic 
one (V72D). The yield of soluble protein in 2C2-V72D and 6B3-V72D is increased 3.2- and 
1.8-fold, respectively. The thermodynamic stability of 2C2-V72D is not changed, while that of 
6B3-V72D is slightly increased with ΔΔGN-U of 2.2 kJ/mol. 
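
If the single mutations act independently, the stabilization expected for a combination is simply the sum of the single-mutant ΔΔGN-U values. The sketch below makes that assumption explicit using the values quoted above for 2C2. V72D is entered as 0.0 because its stability is reported as unchanged, which is an interpretation rather than a quoted number, and Q5V and S90Y are left out because no numerical values are given for them in this section.

```python
# ΔΔGN-U values (kJ/mol) quoted in the text for single mutants of 2C2.
# V72D = 0.0 is an interpretation of "not changed"; Q5V and S90Y are omitted
# because the text gives no numbers for them here.
DDG_2C2 = {"S16G": 6.2, "S76G": 3.7, "T58I": 7.9, "V72D": 0.0}

def additive_ddg(mutations, table):
    """Sum single-mutant ΔΔG values, assuming independent contributions."""
    missing = [m for m in mutations if m not in table]
    if missing:
        raise KeyError(f"no single-mutant value for: {missing}")
    return sum(table[m] for m in mutations)

print(round(additive_ddg(["S16G", "T58I"], DDG_2C2), 1))  # 14.1
```

Comparing such a predicted sum with the measured ΔΔGN-U of the corresponding multiple mutant is one way to judge how closely the effects are additive.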

Multiple mutations 

To determine whether the improvements were additive, we cloned combinations of the single
mutations. The scFv fragments with multiple mutations were expressed and purified as above
and were also monomeric in solution, as demonstrated by analytical gel filtration (2C2-all and
6B3-all as examples in Figure 11). The denaturation curves of all multiple mutants of 2C2
tested showed one steep, cooperative transition (Figure 12(d)), indicating that the Vκ3 domain
is also stabilized with the help of the six mutations in VH6, probably because the mutated VH6
domain stabilizes Vκ3 through the hydrophobic VH-VL interface interactions. In contrast, the
equilibrium unfolding transitions of the double mutants 6B3-Q5V+S16G and 6B3-T58I+S76G
showed a lower cooperativity than 6B3-wt and gave m-values of 18.9
and 19.3 kJ mol⁻¹ M⁻¹, respectively, indicating that the unfolding is no longer a two-state
process. The scFv fragment 6B3 carrying all six mutations derived from the sequence
comparison with the group of VH domains with favorable properties (6B3-all) showed an even
lower cooperativity, with an m-value of 14.3 kJ mol⁻¹ M⁻¹ (Figure 12(a)). The Vλ3 domain,
which has the lowest thermodynamic stability of the isolated VL domains (see Example 1, 11),
probably starts to unfold first in the scFv 6B3 with multiple mutations, while the mutated,
stabilized VH6 domain is still folded and only unfolds at higher concentrations of denaturant.
Because of this lack of two-state behavior, the ΔG_N-U values could not be calculated for the
multiple mutants of 6B3.
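
The m-values and free energies quoted here come from fitting denaturation curves. As a minimal sketch, assuming the standard two-state linear extrapolation model (the free energy of unfolding at denaturant concentration [D] is ΔG_N-U minus m times [D]; the fitting model is not spelled out in this passage), the fraction of unfolded protein can be computed as follows. The function name, the temperature default and the demonstration m-value are illustrative assumptions, not values from this study.

```python
import math

R_KJ = 8.314e-3  # gas constant in kJ mol^-1 K^-1


def fraction_unfolded(dg_nu, m_value, denaturant, temp_k=298.15):
    """Two-state fraction unfolded under the linear extrapolation model.

    dg_nu      -- free energy of unfolding in water (kJ/mol)
    m_value    -- cooperativity (kJ mol^-1 M^-1)
    denaturant -- GdnHCl concentration (M)
    temp_k     -- temperature in K (assumed; not given in the text)
    """
    dg = dg_nu - m_value * denaturant          # free energy of unfolding at [D]
    k_unfold = math.exp(-dg / (R_KJ * temp_k))  # equilibrium constant U/N
    return k_unfold / (1.0 + k_unfold)


# 2C2-wt stability from Table 7 (51.3 kJ/mol); the m-value of 19.0 is a
# placeholder chosen for illustration only.
dg_wt, m_demo = 51.3, 19.0
midpoint = dg_wt / m_demo                       # [D] at which half is unfolded
for d in (0.0, midpoint, 2 * midpoint):
    print(f"{d:5.2f} M GdnHCl -> fraction unfolded {fraction_unfolded(dg_wt, m_demo, d):.3f}")
```

In this model a smaller m-value flattens the transition, which is why the reduced m-values of the 6B3 multiple mutants are read as a loss of two-state behavior.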

The details of the yield of soluble protein and thermodynamic stability determinations are
listed in Table 7. In summary, the effect of the single mutations on yield and stability is
almost fully additive. The scFv fragments carrying all six mutations, 2C2-all and 6B3-all,
show an increase in yield of 4.3- and 4.2-fold, respectively, compared to the wild-type scFv
fragments. The absolute values for 2C2-all are a yield of 5.1 mg / L, which is 3.9 mg / L more
than for 2C2-wt, and a thermodynamic stability of 72.3 kJ / mol. In the case of 6B3-all, a
yield of 1.7 mg / L was obtained, which is 1.3 mg / L more than for 6B3-wt.
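
The absolute figures follow from the wild-type reference values given in the footnotes of Table 7. The sketch below reproduces that arithmetic; the variable and function names are illustrative, and because the tabulated fold changes and ΔΔG values are rounded, the results can differ from the figures in the text in the last digit.

```python
# Wild-type reference values from the footnotes of Table 7
WT_YIELD_MG_PER_L = {"2C2": 1.2, "6B3": 0.4}
WT_DG_KJ_PER_MOL = {"2C2": 51.3, "6B3": 42.4}

# Values for the constructs carrying all six mutations (Table 7)
FOLD_YIELD_ALL = {"2C2": 4.3, "6B3": 4.2}
DDG_ALL = {"2C2": 20.9}  # not determined for 6B3 (low cooperativity)


def absolute_yield(scfv):
    """Absolute yield (mg/L) = wild-type yield x fold increase."""
    return WT_YIELD_MG_PER_L[scfv] * FOLD_YIELD_ALL[scfv]


def absolute_stability(scfv):
    """Absolute free energy of unfolding = wild-type value + ddG."""
    return WT_DG_KJ_PER_MOL[scfv] + DDG_ALL[scfv]


for name in ("2C2", "6B3"):
    gain = absolute_yield(name) - WT_YIELD_MG_PER_L[name]
    print(f"{name}-all: {absolute_yield(name):.2f} mg/L (+{gain:.2f} mg/L over wt)")
print(f"2C2-all: {absolute_stability('2C2'):.1f} kJ/mol")
```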

Analysis of framework 1 subtype

VH structures can be divided into four distinct framework 1 conformations depending on the
type of amino acids at positions 6, 7 and 10 (32) (numbering scheme according to Honegger
& Plückthun (33)). Residues at positions 19, 74, 78 and 93, which are part of the hydrophobic
core of the lower part of the domain and thus influence thermodynamic stability and folding
efficiency, are correlated with this structural subtype (32). While the VH domains with the most
favorable properties fall into subtype II (VH3) and subtype III (VH1a, VH1b and VH5), the VH
domains with less favorable properties, VH2 and VH4, fall into subtype I. VH6, which we want
to improve, can be assigned to subtype III, which is defined by Gln at position 6 and the
absence of Pro at position 7 (32). Analysis of subtype III defining and correlated residues of
human VH domains (32) shows that the VH6 fragment carries rarely used residues at positions
10, 74 and 78 (Table 8). Pro at position 10 is used in 8 % of the sequences, whereas Ala is
used in 76 % of the sequences. Pro allows a more limited number of conformations than
Ala. In a mutagenesis experiment (34), Pro at position 10 was shown to destabilize a VH
domain in a subtype IV context (only occurring in murine, not in human sequences). Val at
position 74 and Ile at position 78 have a frequency of 1 % and 8 %, respectively, among
VH subtype III sequences. Val74 was exchanged in 2C2 and 6B3 to the more frequently found
Phe, as the bulky aromatic amino acid probably increases the packing density of the
hydrophobic core. Ile78 was not exchanged to the subtype III consensus residues Ala or Val,
which are, like Ile, non-aromatic aliphatic residues, as the effect on the packing density would
probably be small. In Figure 15(a) the framework 1 subtype determining and correlated
residues are shown in the model of VH6 (21) (PDB entry: 1DHZ), and in Figure 15(b) the
model of the double mutant with P10A (Pro to Ala at position 10) and V74F is shown.
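
The subtype assignment described above can be sketched as a simple lookup on positions 6, 7 and 10. The rules below are read off Table 8 (subtype I: Glu6 with Pro10; subtype II: Glu6 with Gly10; subtype III: Gln6 without Pro7). The function name and the treatment of every other combination as unassigned are assumptions made for illustration, and subtype IV is not modelled because its defining residues are not listed here.

```python
def framework1_subtype(res6, res7, res10):
    """Assign a framework 1 subtype from one-letter residues at
    positions 6, 7 and 10 (numbering of Honegger & Plückthun).

    Returns "I", "II", "III" or None when no rule applies.
    """
    if res6 == "Q" and res7 != "P":
        return "III"            # Gln6 and no Pro7; position 10 may vary
    if res6 == "E" and res7 == "S":
        if res10 == "P":
            return "I"
        if res10 == "G":
            return "II"
    return None                 # outside the rules sketched here


# Consensus residues at positions 6, 7, 10 as listed in Table 3
consensus = {
    "VH3": ("E", "S", "G"),
    "VH1a": ("Q", "S", "A"),
    "VH2": ("E", "S", "P"),
    "VH6": ("Q", "S", "P"),
}
for family, residues in consensus.items():
    print(family, framework1_subtype(*residues))
```

Under these rules the P10A exchange leaves VH6 in subtype III, since only position 10 changes.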

The mutations to the framework 1 subtype III consensus, P10A alone and in combination with
V74F, were introduced into the wild-type scFv fragments by site-directed mutagenesis. 2C2-P10A
and 6B3-P10A showed a 2.9- and 4.2-fold increase in yield of soluble protein compared
to the wild-type scFv fragments, respectively, while the double mutants with P10A and V74F
showed a lower increase of 1.9- and 1.7-fold, respectively. All biophysical data are
summarized in Table 7. The analysis of the soluble and insoluble fractions of the periplasmic
expression in E. coli of the single and double mutants showed that both the total expression
level and the level of soluble protein were increased by the mutations, and thus the ratio between
soluble and insoluble scFv fragment remained constant (data not shown). The thermodynamic
stability of the scFv fragments 2C2 and 6B3 is not increased by the mutation P10A and is
only slightly increased (ΔΔG_N-U of 0.5 kJ / mol and 0.4 kJ / mol, respectively) by the
double mutation P10A and V74F (Table 7, Figure 12(d)). The biophysical analysis therefore
shows that the mutation P10A indeed increases the folding efficiency, as demonstrated by the
higher yield of periplasmic protein, but does not change the stability in comparison to the
wild-type scFv fragments. In contrast, the mutation V74F may slightly increase the stability because of
enhanced stabilizing interactions in the hydrophobic core, probably at the expense of folding
efficiency, since the positive effect of P10A on yield is decreased in the double mutant.
Because of the higher yield of the single mutant P10A compared to the double mutant
P10A+V74F, which showed only a small increase in thermodynamic stability, we cloned only
the mutation P10A into 2C2-all and 6B3-all, resulting in the constructs scFv-all+P10A. The
yields compared to 2C2-all and 6B3-all were decreased 0.8 and 2.1 fold, respectively. In the
case of 2C2-all+P10A, the thermodynamic stability, with ΔG_N-U of 68.1 kJ / mol, was 4.1 kJ /
mol lower than the stability of 2C2-all. The midpoint of denaturation, which is a
semi-quantitative measure of the thermodynamic stability, was also at a lower
GdnHCl concentration in 6B3-all+P10A than in 6B3-all.

Determination of binding activity

The goal of the study was to show that the yield and stability of VH6-containing scFv fragments
can be improved by the structure-based approach, guided by the family-specific analysis,
while the binding activity is retained. We analyzed the binding activity with two independent
methods: ELISA and BIAcore. For the ELISA, we coated the corresponding antigen and
applied various concentrations of scFv fragments. We tested all single mutations including
scFv-P10A and the multiple mutations scFv-all and scFv-all+P10A. All mutants show a similar
concentration dependence, which indicates that they have the same binding affinity (data not
shown).

BIAcore experiments were performed with different concentrations of scFv fragments
flowing over an antigen-coated chip. Figures 16a and 16b show an overlay of 2C2-wt and -all
and 6B3-wt and -all, respectively, plotted as resonance units (RU) vs. time. The association
and dissociation curves of scFv-wt and -all on the antigen-coated chip superpose in both cases,
indicating that the binding is fully retained. However, the dissociation phase did not return to the
background level observed before injection of the scFv fragments, preventing unambiguous
determination of the antigen dissociation constant (K_D). This unspecific binding was observed at different
antigen-coating densities (2,700 RU and 370 RU, data not shown). This indicates that this
behavior is not due to rebinding on the chip but may be due to a small portion of partially
unfolded scFv fragment that sticks nonspecifically to the antigen-coated chip. Therefore,
competition BIAcore experiments (24,25) were performed to determine K_D in solution. In this
experiment, scFv protein was incubated with soluble antigen, and the mixture was injected onto
a BIAcore chip containing immobilized antigen. Only free scFv, but not antigen-bound scFv,
could bind to antigen on the surface. Thereby, the dissociation constant in solution can be
determined, independent of any unspecific binding events. From the previous experiments, K_D
was estimated to be around 10⁻⁷ M. Therefore, competition BIAcore experiments were
performed with 6B3-wt and 6B3-all at 16 nM and 10 nM, respectively, in the presence of
different concentrations of myoglobin ranging from 50 nM to 30 µM. From a plot of the slope
of the association phase against the corresponding total antigen concentration in solution, the K_D
of 6B3-wt was calculated as (1.9 ± 0.5) · 10⁻⁷ M and that of 6B3-all as (1.5 ± 0.4) · 10⁻⁷ M, as
described previously (26) (Figure 17). The two K_D values agree within experimental error,
indicating that the binding is fully retained.
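
To make the logic of the competition experiment concrete: if simple 1:1 binding in solution is assumed, the concentration of free scFv at equilibrium follows from the total scFv concentration, the total antigen concentration and the dissociation constant by solving a quadratic. The sketch below illustrates this relation only. It is not the fitting procedure of ref. (26), and the function name is hypothetical.

```python
import math


def free_scfv(total_scfv, total_antigen, k_d):
    """Free scFv concentration (M) at equilibrium for 1:1 binding.

    Solves K_D = [P][A]/[PA] with [P] + [PA] = total_scfv and
    [A] + [PA] = total_antigen; all concentrations in mol/L.
    """
    s = total_scfv + total_antigen + k_d
    complex_conc = (s - math.sqrt(s * s - 4.0 * total_scfv * total_antigen)) / 2.0
    return total_scfv - complex_conc


# 6B3-wt at 16 nM with K_D = 1.9e-7 M, antigen between 50 nM and 30 uM
for antigen in (50e-9, 200e-9, 1e-6, 30e-6):
    free = free_scfv(16e-9, antigen, 1.9e-7)
    print(f"antigen {antigen:.1e} M -> free scFv fraction {free / 16e-9:.2f}")
```

Because only free scFv can bind to the chip, the slope of the association phase is expected to scale with this free concentration, which is what makes the slope versus antigen concentration plot informative about the dissociation constant.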

The aim of this study was to demonstrate the validity of the structure-based,
family-consensus-based predictions. We chose scFv fragments containing the human germline family VH6
consensus domain as a model system to improve the expression yield of soluble protein and the
thermodynamic stability. Potential mutations improving these biophysical properties were
identified by comparing the residues that define the framework 1 subtype, and other
interacting residues, to the consensus found within the same subtype. The next set of potential
mutations was found by an analysis of the structure for potential imperfections, guided by a
comparison to the consensus sequences of those VH domains with known favorable
biophysical properties (families 1, 3 and 5). We excluded CDR residues from this analysis.
We could pinpoint such residues, as we had previously systematically determined the
biophysical properties of consensus sequences of all human variable domain subgroups (see
Example 1, 11). The experiment shows that all seven proposed single mutations fall into three
categories. They result either only in an increase in expression yield of soluble protein, or
only in an increase in thermodynamic stability, or in both. This distinction helps to understand
the role of these residues in determining the biophysical properties of these proteins. In the case
of the scFv 2C2 three, and in the case of the scFv 6B3 even five, of these seven mutations result in an
improvement of both biophysical properties. These results illustrate that the combination of
structure-based analysis with family alignments is a powerful way to improve the
properties of immunoglobulin variable domains. Since our analysis (see Example 1, 11)
covers all human families, we now have a general strategy for this task.

The analysis of different combinations of the single mutations to the consensus of VH
domains with favorable properties showed that the improvements in free energy were almost
perfectly additive, indicating that they act independently. The mutant with the highest yield
and thermodynamic stability compared to the wild-type scFv fragments is indeed the mutant
with all six mutations. In the case of the scFv 2C2, the properties of the best mutant are
comparable to the properties of a model scFv fragment consisting of the most stable VH
domain, VH3, and the same VL domain Vκ3 with a different CDR3, which was part of the
systematic biophysical characterization of human variable antibody domains (see Example 1,
11), indicating that it is indeed possible to turn an antibody with unfavorable properties into
one with very favorable properties by changing only a few residues. Most importantly, both the
CDRs and those framework residues that are important for binding are maintained.

The addition of the mutation P10A to the scFv fragments carrying six mutations decreases
both expression yield and thermodynamic stability, although in the wild-type scFv fragments
this mutation increased the soluble yield 2.9-fold in the case of 2C2-P10A and 4.2-fold in the
case of 6B3-P10A and left the thermodynamic stability unchanged. The mutations Q5V and
S16G, which are close to position 10, should still be beneficial to the VH6 framework, as they
are independent of the type of amino acid at position 10. The reason for the diminished
biophysical properties caused by this mutation in the context of the improved framework can
probably only be explained with the help of an experimentally determined 3D structure.

The improvements seem to be independent of the VL domain and of the sequence and length
of CDR3, as 2C2 with Vκ3 and 6B3 with Vλ3 and different H-CDR3 loops gave similar
results. There were only two minor exceptions, as the thermodynamic stability of the 6B3
mutants V72D and S90Y is slightly increased, while in 2C2 no stability increase could be
observed. It was shown previously that in scFv fragments Vλ domains, in contrast to Vκ
domains, are able to form very stable VH-VL interfaces, increasing the stability of the whole
scFv fragment even above the intrinsic stabilities of the isolated domains (see Example 1, 11).
The residue at position 72 is not involved in the interface interactions but is in close proximity
to the interface (Figure 14). It is therefore possible that the mutation V72D leads to a small change in
the orientation of the interface, which has no effect on the Vκ3 domain in 2C2 but a small
stabilizing effect through the interface interactions with the Vλ3 domain of 6B3. The residue
at position 90 is on the side opposite to the interface with VL (Figure 14) and also 29 residues
away from the CDR3, indicating that the slightly increased stability of 6B3 is probably not due
to the different VL domain and CDR3 sequences compared to 2C2.

Although we did not exchange residues of the CDRs with possible direct contact to the
antigen, it could not be excluded a priori that changes in the framework might affect the
orientation of the CDRs and, thereby, antigen binding. Therefore, we experimentally
determined the binding properties. In the case of the examined mutations, antigen
binding was fully retained, as demonstrated by three independent methods.

In this study we show that it is possible to rationally transform antibody frameworks with less
favorable properties into those with very favorable properties while retaining their binding
activity and the binding characteristics of the framework. It could be argued that an easier
approach would be to directly use the very stable VH3 framework with a suitable VL domain.
Nevertheless, framework residues can affect the orientation of the CDRs, can be part of the
hapten-binding cavity located in the VH-VL interface and build the "outer loop", which was
seen in some cases to be involved in antigen binding. These "framework" residues can
thereby contribute greatly to affinity and diversity, and it is unlikely that a single framework
can provide the ideal solution in all cases. Therefore, we believe that the preferred approach to
achieve a structurally diverse library of stable frameworks is to optimize the human consensus
antibody frameworks further in the way we presented here, as it would give access to a whole
range of stable scaffolds covering all natural families.

In this study we focused on the improvement of the VH6 framework. However, because of the
sequence similarity, five of the mutations studied (Q5V, S16G, V72D, S76G and S90Y)
should give similar results for VH domains belonging to families VH2 and VH4. While this
approach is useful for the design of antibody libraries, in many cases given human antibodies,
e.g. from transgenic mice (35,36), obtained by humanization (37) or by phage display from a
library of natural sequences (38-40), may also benefit from improvement.

These results also show that some human germline genes do not encode an optimal version of
the protein with regard to its biophysical properties. Since the biophysical properties of natural
domains cover a wide range, it cannot be argued that limited stability is a desirable property
for the immune system. Rather, the stability of VH2, VH4 and VH6 may simply be good
enough to be tolerated by the immune system. For those biomedical or biotechnological
applications where it is not good enough, however, we have now provided a pathway to
improve these properties in a straightforward way.

Tables

Table 1. Summary of biophysical characterization of isolated VH and VL domains

human     CDR3          soluble yield       oligomeric  midpoint       ΔG_N-U       m
family                  (mg/L, OD550 = 10)  state       [GdnHCl] (M)   (kJ mol⁻¹)   (kJ M⁻¹ mol⁻¹)

VH  1a    long (b)      1.0                 M (g)       1.5            13.7         10.1
    1b    long          1.2                 M           2.1            26.0         12.7
    2     long          ref (f)             n.d. (h)    1.6            n.d.         n.d.
    3     long          2.4                 M           3.0            52.7         17.6
    3 (a) short (c)     2.1                 n.d.        2.7            39.7         14.6
    4     long          ref                 n.d.        1.8            n.d.         n.d.
    5     long          ref                 M           2.2            16.5         7.0
    6     long          ref                 n.d.        0.8            n.d.         n.d.
VL  κ1    κ-like (d)    4.5                 M           2.1            29.0         14.1
    κ2    κ-like        14.2                M           1.5            24.8         16.1
    κ3    κ-like        17.1                M           2.3            34.5         14.8
    κ4    κ-like        9.6                 D,M (i)     1.5            n.d.         n.d.
    λ1    λ-like (e)    0.3                 M           2.1            23.7         11.1
    λ2    λ-like        1.9                 M           1.0            16.0         16.2
    λ3    λ-like        0.8                 D,M         0.9            15.1         15.9

(b) long CDR3, sequence: YNHEADMLIRNWLYSDV
(c) short CDR3, sequence: WGGDGFYAMDY
(d) κ-like CDR3, sequence: QQHYTTPPT
(e) λ-like CDR3, sequence: QSYDSSLSGW
(f) no soluble protein obtained; purification via refolding of inclusion bodies
(g) monomer in 50 mM sodium phosphate (pH 7.0) and 500 mM NaCl, in the case of VH1a with 0.9 M GdnHCl
(h) not determined
(i) dimer and monomer equilibrium
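
As a consistency check on Table 1, the tabulated midpoints can be compared with the ratio of the two fitted parameters. This assumes the two-state linear extrapolation model, under which the denaturation midpoint equals ΔG_N-U divided by m; the fitting model is not stated in the table itself, so this is an assumption. The dictionary below simply transcribes the rows for which all three values are given, and the names are illustrative.

```python
# (midpoint in M, dG_N-U in kJ/mol, m in kJ mol^-1 M^-1) from Table 1
TABLE_1 = {
    "VH1a": (1.5, 13.7, 10.1),
    "VH1b": (2.1, 26.0, 12.7),
    "VH3 (long CDR3)": (3.0, 52.7, 17.6),
    "VH3 (short CDR3)": (2.7, 39.7, 14.6),
    "VH5": (2.2, 16.5, 7.0),
    "Vkappa1": (2.1, 29.0, 14.1),
    "Vkappa2": (1.5, 24.8, 16.1),
    "Vkappa3": (2.3, 34.5, 14.8),
    "Vlambda1": (2.1, 23.7, 11.1),
    "Vlambda2": (1.0, 16.0, 16.2),
    "Vlambda3": (0.9, 15.1, 15.9),
}


def predicted_midpoint(dg_nu, m_value):
    """Midpoint of a two-state transition: dG_N-U / m (in M)."""
    return dg_nu / m_value


# Difference between predicted and tabulated midpoint for every domain
deviations = {
    name: predicted_midpoint(dg, m) - mid
    for name, (mid, dg, m) in TABLE_1.items()
}
for name, dev in deviations.items():
    print(f"{name:18s} deviation {dev:+.2f} M")
```

For these rows the predicted and tabulated midpoints differ by less than 0.2 M.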

Table 2: Sequence alignment of the human consensus VH and VL domains at regions
possibly influencing thermodynamic stability

         charge cluster            upper core                                lower core
AHo (a)  45  53  77  97  99  100   2   4   25  29  31  41  80  82  89  108   19  74  78  93  104

VH3      R   E   R   R   E   D     V   L   A   F   F   M   I   R   L   R     L   V   F   M   Y
VH1a     R   ?   ?   ?   E   ?     V   ?   A   ?   ?   ?   ?   A   A   ?     V   ?   V   ?   Y
VH1b     R   ?   R   ?   F   D     V   T   A   ?   ?   ?   ?   ?   A   ?     T   ?   V   T   ?
VH5      R   E   Q   K   S   D     V   L   G   Y   F   I   I   A   A   R     L   F   V   W   Y
VH2      R   E   R   D   V   D     V   L   F   F   L   V   I   K   V   R     L   L   L   M   Y
VH4      R   E   R   T   A   D     V   L   V   G   I   F   I   V   F   R     L   L   V   L   Y
VH6      R   E   R   T   E   D     V   L   ?   D   V   F   I   P   F   R     L   V   I   L   Y
Vκ1      Q   K   R   Q   E   D     I   M   A   Q   I   L   G   G   F   Q     V   V   F   I   Y
Vκ2      L   Q   R   E   E   D     I   M   S   Q   L   L   G   G   F   Q     A   V   F   I   Y
Vκ3      Q   R   R   E   E   D     I   L   A   Q   V   L   G   G   F   Q     A   V   F   I   Y
Vκ4      Q   K   R   Q   E   D     I   M   S   Q   V   L   G   G   F   Q     A   V   F   I   Y
Vλ1      Q   K   R   Q   E   D     I   L   G   S   I   V   G   K   A   Q     V   V   F   I   Y
Vλ2      Q   K   R   Q   E   D     I   L   G   S   V   V   G   K   A   Q     I   V   F   I   Y
Vλ3      Q   V   R   Q   E   D     I   L   G   ?   L   A   G   N   A   Q     A   I   F   I   Y

(a) Numbering according to the structurally based scheme of Honegger & Plückthun (2001)

Table 3. Key residues of the human VH family consensus sequences

          Class   residues defining      residues differing between well and
                  framework 1 class      poorly behaved VH domains
AHo (a)           6      7      10       5    16   47   58   76   90

VH3       II      E      S      G        V    G    A    ?    G    Y
VH1a      III     Q      S      A        V    G    A    ?    G    Y
VH1b      III     Q      S      A        V    G    A    ?    G    Y
VH5       III     Q      S      A        V    G    M    ?    G    Y
VH2       I       E      S      P        K    T    P    ?    T    V
VH4       I       E      S      P        Q    S    P    ?    S    S
VH6       III     Q      S      P        Q    S    S    T    S    S

(a) Numbering according to the structurally based scheme of Honegger & Plückthun (2001)

Table 4. Sequence alignment of the human consensus VL families

AHo (a)  12   18   138  146  148  149

Vκ1      S    R    T    E    K    R
Vκ2      P    P    T    E    K    R
Vκ3      S    R    T    E    K    R
Vκ4      A    R    T    E    K    R
Vλ1      S    R    V    T    L    G
Vλ2      S    S    V    T    L    G
Vλ3      S    T    V    T    L    G

(a) Numbering according to the structurally based scheme of Honegger & Plückthun (2001)

Table 5. Summary of biophysical characterization of scFv fragments

scFv     CDR3                 soluble      insoluble     oligomeric   midpoint [GdnHCl] (M)
                              yield (c)    content (%)   state (d)    VH (e)      VL (e)

H1aκ3    short / κ-like (a)   11.1 (1.7)   10            m, D, M      1.8         2.8
H1bκ3    short / κ-like       12.4 (1.9)   20            M            2.4         3.0
H2κ3     short / κ-like       2.6 (0.6)    90            M            1.5         2.8
H3κ3     short / κ-like       6.5 (= 1)    30 ± 10       M                 2.8 (f)
H4κ3     short / κ-like       2.6 (0.4)    90            M            2.3         3.0
H5κ3     short / κ-like       6.5 (1.0)    50            M            2.2         3.0
H6κ3     short / κ-like       5.2 (0.8)    80            M            1.2         2.6
H3κ1     short / κ-like       2.6 (0.4)    50            M                 2.8 (f)
H3κ2     short / κ-like       2.6 (0.4)    20            M            2.9         1.6
H3κ3     short / κ-like       6.5 (= 1)    30 ± 10       M                 2.8 (f)
H3κ4     short / κ-like       5.2 (0.8)    40            M            2.8         2.0
H3λ1     short / λ-like (b)   7.8 (1.2)    40            D, M              3.0 (f)
H3λ2     short / λ-like       5.9 (0.9)    10            D, M              2.9 (f)
H3λ3     short / λ-like       3.9 (0.6)    10            D, M              2.8 (f)

(a) sequence of H-CDR3 (short, WGGDGFYAMDY) / L-CDR3 (κ-like: QQHYTTPPT)
(b) sequence of H-CDR3 (short, WGGDGFYAMDY) / L-CDR3 (λ-like: QSYDSSLSGW)
(c) given in mg per 1 L of bacterial culture at an OD550 of 10; in parentheses, relative to the soluble yield of H3κ3
(d) oligomeric state in 50 mM sodium phosphate (pH 7.0) and 500 mM NaCl, with M = monomer; D = dimer; m = multimer
(e) within the scFv fragment
(f) only one transition is visible

Table 6. Framework usage in vivo and in vitro

                 Framework usage of
     Human       germline        137 binders from        Theoretical distribution   250 binders from
     family      segments (a)    Griffiths library (b)   of HuCAL (c)               HuCAL (d)

VH   1a and 1b   24 % (g)        13 %                    12 %                       16 %
     2           6 %             0 %                     9 %                        22 %
     3           43 %            74 %                    10 %                       36 %
     4           22 %            11 %                    19 %                       1 %
     5           4 %             1 %                     18 %                       13 %
     6           2 % (f)         0 %                     32 %                       12 %
VL   κ1          25 %            7 %                     16 %                       13 %
     κ2          12 %            47 %                    16 %                       5 %
     κ3          9 %             2 %                     16 %                       17 %
     κ4          1 % (f)         0 %                     16 %                       12 %
     λ1          9 %             28 %                    12 %                       13 %
     λ2          8 %             4 %                     12 %                       11 %
     λ3          14 %            9 %                     12 %                       28 %
     other       26 %            2 %

(a) Taken from VBASE; 51 human germline segments for VH and 76 for VL.
(b) Taken from Griffiths et al. (1994); originally 215 binders were sequenced, but there are only 137 unique sequences.
    The Griffiths library is built from an in vitro rearranged germline bank; therefore the theoretical distribution is given
    by the percentage of germline segments present in the human genome, as given in column 3.
(c) Theoretical distribution is corrected for the size of the sublibraries and the percentage of correct clones in the original
    HuCAL-1 scFv library (Knappik et al. (2000)).
(d) Taken from Knappik et al. (2000).
(g) including DP-21 (VH7)
(f) one germline segment

Table 7: Summary of yield and stability measurements

                              Yield:                  Stability:
                              normalized to wt (a)    ΔΔG_N-U (kJ / mol) (b)
name          abbreviation    2C2        6B3          2C2               6B3

wt                            = 1        = 1          = 0               = 0
Q5V           a               1.7        2.6          2.4               2.9
S16G          b               ?          ?            6.2               ?
T58I          c               1.0        ?            7.9               6.8
V72D          d               3.2        1.8          0.1               2.2
S76G          e               2.1        1.5          3.7               3.5
S90Y          f               1.3        1.8          -0.1              1.4
              ab              1.8        3.5          9.8 (8.6) (c)     n.d. (d)
              ce              1.4        1.4          10.4 (11.6)       n.d.
              abce            2.3        3.1          18.9 (19.6)       n.d.
              abcde           3.3        3.7          19.5 (19.7)       n.d.
all           abcdef          4.3        4.2          20.9 (19.6)       n.d.
P10A          g               2.9        4.2          0.0               0.0
P10A + V74F   gh              1.9        1.7          0.5               0.4
all + P10A    abcdefg         3.5        2.1          16.8 (19.6)       n.d.

(a) yield of soluble protein after IMAC and ion-exchange column, normalized to the yield of the respective wild-type
    scFv fragments 2C2 and 6B3. Absolute values: 2C2-wt: 1.2 ± 0.1 mg and 6B3-wt: 0.4 ± 0.1 mg per 1 L
    bacterial culture at an OD550 of 10.
(b) Absolute values of the free energy of unfolding of the wild-type scFv fragments: 2C2-wt: ΔG_N-U = 51.3 kJ / mol and
    6B3-wt: ΔG_N-U = 42.4 kJ / mol
(c) in parentheses: sum of the free energy contributions of the individual mutations to equilibrium stability
(d) not determined because of low cooperativity (see text for details)
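
Footnote c invites a direct comparison of measured and summed ΔΔG values. The sketch below transcribes the 2C2 stability column of Table 7 for the multiple mutants and prints the difference between the measured value and the sum of the single-mutant contributions. The dictionary keys are the abbreviations used in the table; the function name is illustrative.

```python
# 2C2 multiple mutants: (measured ddG, sum of single-mutant ddG), kJ/mol
MULTI_2C2 = {
    "ab": (9.8, 8.6),
    "ce": (10.4, 11.6),
    "abce": (18.9, 19.6),
    "abcde": (19.5, 19.7),
    "abcdef": (20.9, 19.6),
    "abcdefg": (16.8, 19.6),
}


def additivity_gap(measured, summed):
    """Measured minus summed ddG; zero means perfectly additive."""
    return measured - summed


for abbreviation, (measured, summed) in MULTI_2C2.items():
    gap = additivity_gap(measured, summed)
    print(f"{abbreviation:8s} measured {measured:5.1f}  sum {summed:5.1f}  gap {gap:+.1f}")
```

Within the six consensus mutations the gaps stay within about 1.3 kJ / mol in either direction, whereas adding P10A (abcdefg) gives the largest shortfall.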

Table 8. Analysis of framework-1 subtype

                        subtype-defining residues (a)                  subtype-correlated core residues (a)
name          subtype   H6 (b)            H7           H10             H19          H74         H78               H93

              I         Glu               Ser          Pro             Leu          Leu         Ala/Val/Ile/Leu   Leu/Met
              II        Glu               Ser          Gly             Leu          Val         Phe               Met
              III       Gln               Ser          any (Ala) (c)   Leu/Val      Phe         Ala/Val           Leu
wt            III       Gln (100 %) (d)   Ser (84 %)   Pro (8 %)       Leu (56 %)   Ile (1 %)   Ile (8 %)         Leu (63 %)
P10A          III       Gln               Ser          Ala             Leu          Ile         Ile               Leu
P10A + I74F   III       Gln               Ser          Ala             Leu          Phe         Ile               Leu

(a) according to ref. (32)
(b) using the numbering scheme of Honegger & Plückthun (33)
(c) Ala is used in 76 % of subtype III sequences (32)
(d) percentage use of specified amino acid in subtype III sequences, regardless of VH family (32)

1. An isolated polypeptide comprising a VH domain selected from the group consisting
of (i) a VH domain belonging to the VH1a subclass, wherein said VH domain comprises
an amino acid residue F at position 29 and/or L at position 89; (ii) a VH domain
belonging to the VH1b subclass, wherein said VH domain comprises the amino acid
residue L at position 89; (iii) a VH domain belonging to the VH2 subclass, wherein said
VH domain comprises at least one amino acid residue selected from the group
consisting of G at position 16, V at position 44, A at position 47, G at position 76, F at
position 78, Y at position 90, R at position 97, E at position 99, wherein if R is at
position 97, then E is at position 99; (iv) a VH domain belonging to the VH4 subclass,
wherein said VH domain comprises at least one amino acid residue selected from the
group consisting of G at position 16, A at position 47, F at position 78, Y at position
90, R at position 97, and E at position 99, wherein if R is at position 97, then E is at
position 99; (v) a VH domain belonging to the VH5 subclass, wherein said VH domain
comprises at least one amino acid residue selected from the group consisting of L at
position 89, R at position 97, and E at position 99, wherein if R is at position 97, then
E is at position 99; and (vi) a VH domain belonging to the VH6 subclass, wherein said
VH domain comprises at least one amino acid residue selected from the group
consisting of V at position 5, G at position 16, I at position 58, F at position 78, Y at
position 90 and R at position 97, and E at position 99, wherein if R is at position 97,
then E is at position 99.

2. An isolated polypeptide according to claim 1, comprising a VH domain belonging to
the VH1a subclass, wherein said VH domain comprises an amino acid residue F at
position 29 and/or L at position 89.

3. An isolated polypeptide according to claim 1, comprising a VH domain belonging to
the VH1b subclass, wherein said VH domain comprises the amino acid residue L at
position 89.

4. An isolated polypeptide according to claim 1, comprising a VH domain belonging to
the VH2 subclass, wherein said VH domain comprises at least one amino acid residue
selected from the group consisting of G at position 16, V at position 44, A at position
47, G at position 76, F at position 78, Y at position 90, R at position 97, E at position
99, wherein if R is at position 97, then E is at position 99.

5. An isolated polypeptide according to claim 1, comprising a VH domain belonging to
the VH4 subclass, wherein said VH domain comprises at least one amino acid residue
selected from the group consisting of G at position 16, A at position 47, F at position
78, Y at position 90, R at position 97, and E at position 99, wherein if R is at position
97, then E is at position 99.

6. An isolated polypeptide according to claim 1, comprising a VH domain belonging to
the VH5 subclass, wherein said VH domain comprises at least one amino acid residue
selected from the group consisting of L at position 89, R at position 97, and E at
position 99, wherein if R is at position 97, then E is at position 99.

7. An isolated polypeptide according to claim 1, comprising a VH domain belonging to
the VH6 subclass, wherein said VH domain comprises at least one amino acid residue
selected from the group consisting of V at position 5, G at position 16, I at position 58,
F at position 78, Y at position 90 and R at position 97, and E at position 99, wherein if
R is at position 97, then E is at position 99.

8. An antibody or functional fragment thereof comprising a VH domain according to
claim 1.

9. A library of antibodies or functional fragments thereof comprising one or more
antibodies or functional fragments thereof according to claim 8.

10. An isolated nucleic acid sequence encoding a polypeptide selected from the group
consisting of (i) a polypeptide comprising a VH domain belonging to the VH1a
subclass, wherein said VH domain comprises an amino acid residue F at position 29
and/or L at position 89; (ii) a polypeptide comprising a VH domain belonging to the
VH1b subclass, wherein said VH domain comprises the amino acid residue L at
position 89; (iii) a polypeptide comprising a VH domain belonging to the VH2
subclass, wherein said VH domain comprises at least one amino acid residue selected
from the group consisting of G at position 16, V at position 44, A at position 47, G at
position 76, F at position 78, Y at position 90, R at position 97, E at position 99,
wherein if R is at position 97, then E is at position 99; (iv) a polypeptide comprising a
VH domain belonging to the VH4 subclass, wherein said VH domain comprises at least
one amino acid residue selected from the group consisting of G at position 16, A at
position 47, F at position 78, Y at position 90, R at position 97, and E at position 99,
wherein if R is at position 97, then E is at position 99; (v) a polypeptide comprising a
VH domain belonging to the VH5 subclass, wherein said VH domain comprises at least
one amino acid residue selected from the group consisting of L at position 89, R at
position 97, and E at position 99, wherein if R is at position 97, then E is at position
99; and (vi) a polypeptide comprising a VH domain belonging to the VH6 subclass,
wherein said VH domain comprises at least one amino acid residue selected from the
group consisting of V at position 5, G at position 16, I at position 58, F at position 78,
Y at position 90 and R at position 97, and E at position 99, wherein if R is at position
97, then E is at position 99.

11. A vector comprising a nucleic acid sequence corresponding to the nucleic acid 
sequence according to claim 10. 

12. A host cell harboring a nucleic acid sequence corresponding to the nucleic acid 
sequence according to claim 10. 

13. A method for producing a VH domain or an antibody or a functional fragment thereof 
comprising the step of expressing an isolated nucleic acid sequence according to claim 
10. 

14. A method for obtaining an isolated nucleic acid sequence, comprising the step of (i) 
substituting, in a nucleic acid sequence that encodes a VH1a subclass domain, at least 
one codon that encodes an amino acid residue selected from the group consisting of F 
at position 29 and L at position 89; or (ii) substituting, in a nucleic acid sequence that 
encodes a VH1b subclass domain, a codon that encodes the amino acid residue L at 
position 89; or (iii) substituting, in a nucleic acid sequence that encodes a VH2 
subclass domain, at least one codon that encodes an amino acid residue selected from 
the group consisting of G at position 16, V at position 44, A at position 47, G at 
position 76, F at position 78, R at position 97, and E at position 99, wherein if R is at 
position 97, then E is at position 99; or (iv) substituting, in a nucleic acid sequence 
that encodes a VH2 subclass domain, a codon that encodes the amino acid residue Y at 
position 90; or (v) substituting, in a nucleic acid sequence that encodes a VH4 subclass 
domain, at least one codon that encodes an amino acid residue selected from the group 
consisting of G at position 16, V at position 44, A at position 47, G at position 76, F at 
position 78, R at position 97, and E at position 99, wherein if R is at position 97, then 
E is at position 99; or (vi) substituting, in a nucleic acid sequence that encodes a VH4 
subclass domain, a codon that encodes the amino acid residue Y at position 90; or (vii) 
substituting, in a nucleic acid sequence that encodes a VH5 subclass domain, at least 
one codon that encodes an amino acid residue selected from the group consisting of R 
at position 77, L at position 89, R at position 97, and E at position 99, wherein if R is 
at position 97, then E is at position 99; or (viii) substituting, in a nucleic acid sequence 
that encodes a VH6 subclass domain, at least one codon that encodes an amino acid 
residue selected from the group consisting of V at position 5, G at position 16, V at 
position 44, I at position 58, D at position 72, G at position 76, F at position 78, R at 
position 97, and E at position 99, wherein if R is at position 97, then E is at position 
99; or (ix) substituting, in a nucleic acid sequence that encodes a VH6 subclass 
domain, a codon that encodes the amino acid residue Y at position 90. 

15. A method according to claim 14, wherein 2 or more codons are substituted in said 
nucleic acid sequence. 

16. A method according to claim 14, further comprising the steps of: 

(i) identifying for said domain the corresponding amino acid consensus 
sequence selected from the group of VH consensus sequences consisting of 
VH1a, VH1b, VH2, VH4, VH5, and VH6; and 

(ii) substituting one or more codons corresponding to amino acid residues of 
said consensus sequence into a corresponding position(s) in said nucleic 
acid sequence of said domain. 

17. A method of obtaining a polypeptide, comprising the step of expressing a nucleic acid 
sequence according to claim 14. 

18. A method for constructing a library of antibodies or functional fragments thereof, 
comprising the steps of: (i) obtaining at least one nucleic acid sequence according to 
claim 14; and (ii) diversifying said obtained nucleic acid sequence to generate a 
population of diversified nucleic acid sequences, wherein said diversified nucleic acid 
sequences can be expressed for generating and screening of antibody libraries 
comprising diversified VH domains. 

19. An isolated polypeptide comprising a VL domain selected from the group consisting of 
(i) a VL domain belonging to the VLκ2 subclass, wherein said VL domain comprises 
the amino acid residue R at position 18, and wherein if R is at position 18, then T is at 
position 92; and (ii) a VL domain belonging to the VLλ1 subclass, wherein said VL 
domain comprises the amino acid residue K at position 47. 

20. An isolated polypeptide according to claim 19, comprising a VL domain belonging to 
the VLκ2 subclass, wherein said VL domain comprises the amino acid residue R at 
position 18, and wherein if R is at position 18, then T is at position 92. 

21. An isolated polypeptide according to claim 19, comprising a VL domain belonging to 
the VLλ1 subclass, wherein said VL domain comprises the amino acid residue K at 
position 47. 



22. An antibody or a functional fragment thereof comprising a VL domain according to 
claim 19. 

23. A library of antibodies or functional fragments thereof comprising one or more 
antibodies or functional fragments thereof according to claim 22. 

24. An isolated nucleic acid molecule encoding a polypeptide selected from the group 
consisting of (i) a polypeptide comprising a VL domain belonging to the VLκ2 
subclass, wherein said VL domain comprises the amino acid residue R at position 18, 
and wherein if R is at position 18, then T is at position 92; and (ii) a polypeptide 
comprising a VL domain belonging to the VLλ1 subclass, wherein said VL domain 
comprises the amino acid residue K at position 47. 

25. A vector comprising a nucleic acid sequence corresponding to the nucleic acid 
sequence according to claim 24. 

26. A host cell harboring a nucleic acid sequence corresponding to the nucleic acid 
sequence according to claim 24. 

27. A method for producing a VL domain or an antibody or a functional fragment thereof 
comprising the step of expressing an isolated nucleic acid sequence according to claim 
24. 

28. A method for obtaining a nucleic acid sequence, comprising the step of (i) 
substituting, in a nucleic acid sequence that encodes a VLκ2 subclass domain, at least 
one codon that encodes an amino acid residue selected from the group consisting of S 
at position 12, Q at position 45, and R at position 18, and wherein if R is at position 
18, then T is at position 92; or (ii) substituting, in a nucleic acid sequence that encodes 
a VLλ1 subclass domain, at least one codon that encodes the amino acid residue K at 
position 47; or (iii) substituting, in a nucleic acid sequence that encodes a VLλ1 
domain, at least three codons that encode the amino acid residues S at position 7, P at 
position 8, and S at position 9, respectively; or (iv) substituting, in a nucleic acid 
sequence that encodes a VLλ2 domain, at least three codons that encode the amino 
acid residues S at position 7, P at position 8, and S at position 9, respectively; or (v) 
substituting, in a nucleic acid sequence that encodes a VLλ3 domain, at least three 
codons that encode the amino acid residues S at position 7, P at position 8, and S at 
position 9, respectively. 

29. A method according to claim 28, wherein 2 or more codons are substituted in said 
nucleic acid sequence. 

30. A method according to claim 28, further comprising the steps of: 

(i) identifying for said domain the corresponding amino acid consensus 
sequence selected from the group of VL consensus sequences consisting of 
VLκ2, VLλ1, VLλ2, and VLλ3; and 

(ii) substituting one or more codons corresponding to amino acid residues of 
said consensus sequence into a corresponding position(s) in said nucleic 
acid sequence of said domain. 

31. A method of obtaining a polypeptide, comprising the step of expressing a nucleic acid 
sequence according to claim 24. 

32. A method for constructing a library of antibodies or functional fragments thereof, 
comprising the steps of: (i) obtaining at least one nucleic acid sequence according to 
claim 24; and (ii) diversifying said obtained nucleic acid sequence to generate a 
population of diversified nucleic acid sequences, wherein said diversified nucleic acid 
sequences can be expressed for generating and screening of antibody libraries 
comprising said diversified VH domains. 



33. An antibody or a functional fragment thereof comprising (i) a polypeptide of claim 1 
and (ii) a polypeptide of claim 19. 

Figure 6 